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Preface 



Bellman has called matrix theory 'the arithmetic of higher mathematics.' Under the influence 
of Bellman and Kalman engineers and scientists have found in matrix theory a language for repre- 
senting and analyzing multivariable systems. Our goal in these notes is to demonstrate the role of 
matrices in the modeling of physical systems and the power of matrix theory in the analysis and 
synthesis of such systems. 

Beginning with modeling of structures in static equilibrium we focus on the linear nature of the 
relationship between relevant state variables and express these relationships as simple matrix-vector 
products. For example, the voltage drops across the resistors in a network are linear combinations 
of the potentials at each end of each resistor. Similarly the current through each resistor is as- 
sumed to be a linear function of the voltage drop across it. And, finally, at equilibrium, a linear 
combination (in minus out) of the currents must vanish at every node in the network. In short, the 
vector of currents is a linear transformation of the vector of voltage drops which is itself a linear 
transformation of the vector of potentials. A linear transformation of n numbers into m numbers is 
accomplished by multiplying the vector of n numbers by an m-by-n matrix. Once we have learned 
to spot the ubiquitous matrix-vector product we move on to the analysis of the resulting linear 
systems of equations. We accomplish this by stretching your knowledge of three-dimensional space. 
That is, we ask what does it mean that the m-by-n matrix X transforms R n (real n-dimensional 
space) into R m ? We shall visualize this transformation by splitting both R n and R m each into two 
smaller spaces between which the given X behaves in very manageable ways. An understanding of 
this splitting of the ambient spaces into the so called four fundamental subspaces of X permits one 
to answer virtually every question that may arise in the study of structures in static equilibrium. 

In the second half of the notes we argue that matrix methods are equally effective in the modeling 
and analysis of dynamical systems. Although our modeling methodology adapts easily to dynamical 
problems we shall see, with respect to analysis, that rather than splitting the ambient spaces we 
shall be better served by splitting X itself. The process is analogous to decomposing a complicated 
signal into a sum of simple harmonics oscillating at the natural frequencies of the structure under 
investigation. For we shall see that (most) matrices may be written as weighted sums of matrices of 
very special type. The weights are the eigenvalues, or natural frequencies, of the matrix while the 
component matrices are projections composed from simple products of eigenvectors. Our approach 
to the eigendecomposition of matrices requires a brief exposure to the beautiful field of Complex 
Variables. This foray has the added benefit of permitting us a more careful study of the Laplace 
Transform, another fundamental tool in the study of dynamical systems. 
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1. Matrix Methods for Electrical Systems 



1.1. Nerve Cables and the Strang Quartet 

We wish to confirm, by example, the prefatory claim that matrix algebra is a useful means of 
organizing (stating and solving) multivariable problems. In our first such example we investigate 
the response of a neuron to a constant current stimulus. Ideally, a neuron is simply a cylinder of 
radius a and length £ that conducts electricity both along its length and across its lateral membrane. 
Though we shall, in subsequent chapters, delve more deeply into the biophysics, here, in our first 
outing, we stick to its purely resistive properties. Theses are expressed via two quantities: pi, the 
resistivity, in Qcm, of the cytoplasm that fills the cell, and p m , the resistivity in Vtcm 2 of the cell's 
lateral membrane. 




Figure 1.1. A3 compartment model of a neuron. 



Although current surely varies from point to point along the neuron it is hoped that these variations 
are regular enough to be captured by a multicompartment model. By that we mean that we choose 
a number iV and divide the neuron into iV segments each of length £/N. Denoting a segment's axial 
resistance by 

R PM* 

tii — 5— 

7TGr 

and membrane resistance by 

p _ P m 
m ~ 2<na£/N 

we arrive at the lumped circuit model of Figure 1.1. For a neuron in culture we may assume a 
constant extracellular potential, e.g., zero. We accomplish this by connecting and grounding the 
extracellular nodes, see Figure 1.2. 
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Figure 1.2. A rudimentary neuronal circuit model. 

This figure also incorporates the exogenous disturbance, a current stimulus between ground and the 
left end of the neuron. Our immediate goal is to compute the resulting currents through each resistor 
and the potential at each of the nodes. Our long-range goal is to provide a modeling methodology 
that can be used across the engineering and science disciplines. As an aid to computing the desired 
quantities we give them names. With respect to Figure 1.3 we label the vector of potentials 



x 



x 3 



and vector of currents y 



(yi\ 

2/2 
2/3 
2/4 
V5 

w 



We have also (arbitrarily) assigned directions to the currents as a graphical aid in the consistent 
application of the basic circuit laws. 

R. Ri R. 
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Figure 1.3 The fully dressed circuit model. 

We incorporate the circuit laws in a modeling methodology that takes the form of a Strang 
Quartet after ?, 

(51) Express the voltage drops via e = —Ax. 

(52) Express Ohm's Law via y = Ge. 

(53) Express Kirchhoff's Current Law via A T y = —f. 

(54) Combine the above into A T GAx = f. 

The A in (SI) is the node-edge adjacency matrix - it encodes the network's connectivity. The G 
in (S2) is the diagonal matrix of edge conductances - it encodes the physics of the network. The / 
in (S3) is the vector of current sources - it encodes the network's stimuli. The culminating A T GA 
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in (S4) is the symmetric matrix whose inverse, when applied to /, reveals the vector of potentials, 
x. In order to make these ideas our own we must work many, many examples. 



1.2. Example 1 



With respect to the circuit of figure 1.3, in accordance with step (SI), we express the six potentials 
differences (always tail minus head) 

ei = x 1 - x 2 
e 2 = x 2 
e 3 = x 2 - x 3 
e 4 = x 3 

?5 = %3 - x 4 
e 6 = x A 

Such long, tedious lists cry out for matrix representation, to wit 



e = —Ax where A = 



Step (S2), Ohm's law, states that the current along an edge is equal to the potential drop across 
the edge divided by the resistance of the edge. In our case, 
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or, in matrix notation, 
where 



ej/Ri, j = 1,3,5 and yj = ej/R m , j = 2,4,6 
y = Ge 



n/Ri 
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Step (S3), Kirchhoff's Current Law, states that the sum of the currents into each node must be 
zero. In our case 

- V\ = 
V\ - V2 - 2/3 = 

1/3-2/4-2/5 = 

2/5-2/6 = 

or, in matrix terms 

By = -f 
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where 



B 
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1 -v 
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Turning back the page we recognize in B the transpose of A. Calling it such, we recall our main 

steps 

e = — Ax, y = Ge, and A T y = —f. 



On substitution of the first two into the third we arrive, in accordance with (S4), at 



A T GAx = f. 



(1.1) 



This is a linear system of four simultaneous equations for the 4 unknown potentials, x\ through x±. 
As you may know, the system (1.1) may have either 1, 0, or infinitely many solutions, depending on 
/ and A T GA. We shall devote chapters 3 and 4 to a careful analysis of the previous sentence. For 
now, we simply invoke the Matlab backslash command and arrive at the response depicted in 
Figure 1.4. 




0.1 0.2 0. 



z (cm) 



Figure 1.4. Results of a 16 compartment simulation, cabl 



m. 



Once the structure of the constituents in the fundamental system (1.1) is determined it is an easy 
matter to implement it, as we have done in cabl . m, for an arbitrary number of compartments. In 
Figure 1.4 we see that the stimulus at the neuron's left end produces a depolarization there that 
then attenuates with distance from the site of stimulation. 

1.3. Example 2 

We have seen in the previous section how a current source may produce a potential difference 
across a neuron's membrane. We note that, even in the absence of electrical stimuli, there is always 
a difference in potential between the inside and outside of a living cell. In fact, this difference is 
one of the biologist's definition of 'living.' Life is maintained by the fact that the neuron's interior 
is rich (relative to the cell's exterior) in potassium ions and poor in sodium and chloride ions. 
These concentration differences establish a resting potential difference, E m , across the cell's lateral 
membrane. The modified circuit diagram is given in Figure 1.5. 
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Figure 1.5 Circuit model with batteries associated with the rest potential. 

The convention is that the potential difference across the battery is E m . As the bottom terminal 
of each battery is grounded it follows that the potential at the top of each battery is E m . Revisiting 
steps (Sl-4) of the Strang Quartet we note that in (SI) the even numbered voltage drops are now 

e 2 = x 2 - E m , e 4 = x 3 - E rn and e 6 = a; 4 - E m . 

We accommodate such things by generalizing (SI) to 

(SI') Express the voltage drops as e = b — Ax where b is the vector that encodes the batteries. 
No changes are necessary for (S2) and (S3). The final step now reads, 
(S4') Combine (SI'), (S2) and (S3) to produce 

(1.2) 

This is the general form for a resistor network driven by current sources and batteries. 
Returning to Figure 1.5 we note that 

b = -E m [0 1 1 if and A T Gb = (E m /R m )[0 1 1 1] T 

To build and solve (1.2) requires only minor changes to our old code. The new program is called 
cab2 .m and results of its use are indicated in Figure 1.6. 

-58 - 
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-60 
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-64 
-65 
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z (cm) 

Figure 1.6. Results of a 16 compartment simulation with batteries, E m = —70 mV. cab2 .m 
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A 1 GAx = A 1 Gb + f. 




1.4. Exercises 



1. In order to refresh your matrix- vector multiply skills please calculate, by hand, the product 
A T GA in the 3 compartment case and write out the 4 equations in (1.1). The second equation 
should read 

(-an + 2x 2 - x 3 )/Ri + x 2 /R m = 0. (1.3) 

2. We began our discussion with the 'hope' that a multicompartment model could indeed ade- 
quately capture the neuron's true potential and current profiles. In order to check this one 
should run cabl . m with increasing values of N until one can no longer detect changes in the 
computed potentials. 

(a) Please run cabl .m with N = 8, 16, 32 and 64. Plot all of the potentials on the same (use 
hold) graph, using different line types for each. (You may wish to alter cabl . m so that it 
accepts N as an argument). 

Let us now interpret this convergence. The main observation is that the difference equation, 
(1.3), approaches a differential equation. We can see this by noting that 

dz = £/N 

acts as a spatial 'step' size and that Xk, the potential at (k — l)dz, is approximately the value 
of the true potential at (k — l)dz. In a slight abuse of notation, we denote the latter 

x((k-l)dz). 

Applying these conventions to (1.3) and recalling the definitions of Ri and R m we see (1.3) 
become 

na 2 —x(0) + 2x{dz) — x(2dz) 27radz 

" I ocidiZ) — 

Pi dz p m 

or, after multiplying through by p m /(nadz), 

ap m —x(0) + 2x(dz) — x(2dz) 



Pi dz 2 



+ 2x(dz) = 0. 



We note that a similar equation holds at each node (save the ends) and that as iV — > oo and 
therefore dz — > we arrive at 

d 2 x(z) 2pi , . , 
-tV - -£-x(z) = 0. 1.4 
dz 1 ap m 

(b) With p = 2pi/(ap m ) show that 

x(z) = a sinh( v //iz) + (5 cosh( v //Iz) (1.5) 
satisfies (1.4) regardless of a and f3. 

We shall determine a and (5 by paying attention to the ends of the neuron. At the near end 

na 2 x(0) — x(dz) 



we find 



which, as dz — > becomes 



Pi dz 

dx(0) piio 
dz na 2 
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= la, 



(1.6) 



At the far end, we interpret the condition that no axial current may leave the last node to 
mean 

^2=0. (1.7) 

dz 

(c) Substitute (1.5) into (1.6) and (1.7) and solve for a and (3 and write out the final x(z). 

(d) Substitute into x the £,a,pi and p m values used in cabl . m, plot the resulting function 
(using, e.g., ezplot) and compare this to the plot achieved in part (a). 
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2. Matrix Methods for Mechanical Systems 



2.1. Elastic Fibers and the Strang Quartet 

We connect 3 masses (nodes) with four springs (fibers) between two immobile walls, as in Fig- 
ure 2.1, and apply forces at the masses and measure the associated displacement. 




Figure 2.1. A fiber chain. 

We suppose that a horizontal force, is applied to each rrij, and produces a horizontal dis- 
placement Xj, with the sign convention that rightward means positive. The bars at the ends of 
the figure indicate rigid supports incapable of movement. The kj denote the respective spring stiff- 
nesses. Regarding units, we measure fj in Newtons (N) and Xj in meters (m) and so stiffness, 
kj, is measured in (N/m). In fact each stiffness is a parameter composed of both 'material' and 
'geometric' quantities. In particular, 

kj = Y >"> (2.1) 



where Yj is the fiber's Young's modulus (N/m 2 ), aj is the fiber's cross-sectional area (m 2 ) and Lj 
is the fiber's (reference) length (m). 

The analog of potential difference is here elongation. If ej denotes the elongation of the jth 
spring then naturally, 



e 1 — x 1 , e 2 = x 2 -x 1 , e 3 = x 3 -x 2 , and e 4 = —x 3 , 



or, in matrix terms, 



e = Ax where A 



( 1 
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1 
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-v 



We note that Cj is positive when the spring is stretched and negative when compressed. The analog 
of Ohm's Law is here Hooke's Law: the restoring force in a spring is proportional to its elongation. 
We call this constant of proportionality the stiffness, kj, of the spring, and denote the restoring 
force by yj. Hooke's Law then reads, yj = kjtj, or, in matrix terms 



Ke where K 



The analog of Kirchhoff's Current Law is here typically called 'force balance.' More precisely, 
equilibrium is synonymous with the fact that the net force acting on each mass must vanish. In 
symbols, 

V\ ~ V2 ~ fi = 0, y 2 - 2/3 - /2 = 0, and y 3 - y 4 - f 3 = 0, 



(h 













k 2 














k 3 













k 4 J 
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or, in matrix terms 



By = f where / 




and B 



(I 





i 






As is the previous section we recognize in B the transpose of A. Gathering our three important 
steps 

e = Ax 
y = Ke 
A T y = f 

we arrive, via direct substitution, at an equation for x. Namely 



A T y = f A T Ke = f A T KAx = f. 



Assembling A T KA we arrive at the final system 

'k\ + k 2 —k 2 
-k 2 k 2 + k 3 -k 3 
—k 3 k 3 + k±. 



xA 




x 2 






V/3 



(2.2) 



Although Matlab solves such systems with ease our aim here is to develop a deeper understanding 
of Gaussian Elimination and so we proceed by hand. This aim is motivated by a number of 
important considerations. First, not all linear systems have unique solutions. A careful look at 
Gaussian Elimination will provide the general framework for not only classifying those systems 
that possess unique solutions but also for providing detailed diagnoses of those systems that lack 
solutions or possess too many. 

In Gaussian Elimination one first uses linear combinations of preceding rows to eliminate nonzeros 
below the main diagonal and then solves the resulting triangular system via back-substitution. To 
firm up our understanding let us take up the case where each kj = 1 and so (2.2) takes the form 



(2.3) 




Xl\ 




x 2 




X3/ 





We eliminate the (2, 1) (row 2, column 1) element by implementing 



new row 2 



bringing 




old row 2 + -row 1, 



fl 



(2.4) 



h + A/2 

h 



We eliminate the current (3, 2) element by implementing 



new row 3 



old row 3 + -row 2, 



(2.5) 



bringing the upper-triangular system 

Ux = g, (2.6) 



or, more precisely, 

G 

One now simply reads off 

This in turn permits the solution of the second equation 

x 2 = 2(x 3 + h + /i/2)/3 = {h + 2/ 2 + / 3 )/2, 

and, in turn, 

xi = (x 2 + /i)/2 = (3/i + 2/ 2 + / 3 )/4. 

One must say that Gaussian Elimination has succeeded here. For, regardless of the actual elements 
of / we have produced an x for which A T KAx = f. 

Although Gaussian Elimination, remains the most efficient means for solving systems of the form 
Sx — f it pays, at times, to consider alternate means. At the algebraic level, suppose that there 
exists a matrix that 'undoes' multiplication by S in the sense that multiplication by 2 _1 undoes 
multiplication by 2. The matrix analog of 2 _1 2 = 1 is 

S^S = I 

where I denotes the identity matrix (all zeros except the ones on the diagonal). We call S* -1 "the 
inverse of S" or "S inverse" for short. Its value stems from watching what happens when it is 
applied to each side of Sx = f. Namely, 

Sx = f S^Sx = S- 1 / =>Ix = S- 1 / ^x = S- 1 /. 

Hence, to solve Sx — f for x it suffices to multiply / by the inverse of S. Let us now consider how 
one goes about computing S -1 . In general this takes a little more than twice the work of Gaussian 
Elimination, for we interpret 

ss- 1 = I 

as n (the size of S) applications of Gaussian elimination, with / running through n columns of 
the identity matrix. The bundling of these n applications into one is known as the Gauss- Jordan 
method. Let us demonstrate it on the S appearing in (2.3). We first augment S with I. 

2 -1 | 1 0\ 
-1 2 -10 10 
-1 2 |001/ 

We then eliminate down, being careful to address each of the 3 / vectors. This produces 

2 -1 | 1 0\ 

o 3/2 -l ; 1/2 1 

4/3 | 1/3 2/3 1/ 



-1 




/Xi 


3/2 









4/3/ 


\x 3 



fl 

h + A/2 
j3 + 2/ 2 /3 + A/3, 



(2.7) 



ar 3 = (/i + 2/ 2 + 3/ 3 )/4. 
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1 V3 
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Now, rather than simple back-substitution we instead eliminate up. Eliminating first the (2, 3) 
element we find 

'l -1 
3/2 
K 4/3 

Now eliminating the (1,2) element we achieve 

'2 
3/2 
K 4/3 

In the final step we scale each row in order that the matrix on the left takes on the form of the 
identity. This requires that we multiply row 1 by 1/2, row 2 by 3/2 and row 3 by 3/4, with the 
result 

'1 3/4 1/2 1/4 N 
1 | 1/2 1 1/2 
K 1 1/4 1/2 3/4, 

Now in this transformation of S into I we have, ipso facto, transformed / to S^ 1 , i.e., the matrix 
that appears on the right upon applying the method of Gauss-Jordan is the inverse of the matrix 
that began on the left. In this case, 

/3/4 1/2 1/4 N 
S- 1 =1/2 1 1/2 
\l/4 1/2 3/4, 

One should check that S _1 f indeed coincides with the x computed above. 

Not all matrices possess inverses. Those that do are called invertible or nonsingular. For 
example 

T 2 N 
is singular. 

Some matrices can be inverted by inspection. An important class of such matrices is in fact 
latent in the process of Gaussian Elimination itself. To begin, we build the elimination matrix that 
enacts the elementary row operation spelled out in (2.4), 

E 1 

Do you 'see' that this matrix (when applied from the left to S) leaves rows 1 and 3 unsullied but 
adds half of row one to two? This ought to be 'undone' by simply subtracting half of row 1 from 
row two, i.e., by application of 

/ 1 N 
E^ 1 = -1/2 1 
V 1 

Please confirm that E^Ei is indeed I. Similarly, the matrix analogs of (2.5) and its undoing are 

/l 0\ 
Ei = I 1 I and E^ 1 
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Again, please confirm that E 2 E 2 = I. Now we may express the reduction of S to U (recall (2.6)) 

as 

E 2 E 1 S = U 

and the subsequent reconstitution by 

/ 1 

S = LU, where L = E^E 2 l =1-1/2 1 

\ -2/3 1 

One speaks of this representation as the LU decomposition of S. We have just observed that the 
inverse of a product is the product of the inverses in reverse order. Do you agree that S* -1 = U~ 1 L~ 1 1 
And what do you think of the statement S^ 1 = A~ 1 K~ 1 (A T )~ 1 1 

LU decomposition is the preferred method of solution for the large linear systems that occur in 
practice. The decomposition is implemented in Matlab as 

[L U] = lu(S); 

and in fact lies at the heart of Matlab 's blackslash command. To diagram its use, we write Sx = f 
as LUx = f and recognize that the latter is nothing more than a pair of triangular problems: 

Lc = f and Ux = c, 

that may be solved by forward and backward substitution respectively. This representation achieves 
its greatest advantage when one is asked to solve Sx = f over a large class of / vectors. For example, 
if we wish to steadily increase the force, f 2 , on mass 2, and track the resulting displacement we 
would be well served by 

[L,U] = lu(S) ; 
f = [1 1 1]; 
for j=l:100, 

f(2) = f(2) + j/100; 

x = U \ (L \ f ) ; 

plot (x, ' o' ) 

end 

You are correct in pointing out that we could have also just precomputed the inverse of S and then 
sequentially applied it in our for loop. The use of the inverse is, in general, considerably more costly 
in terms of both memory and operation counts. The exercises will give you a chance to see this for 
yourself. 

2.2. A Small Planar Network 

We move from uni-axial to biaxial elastic nets by first considering the swing in Figure 2.2. 



) 
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Figure 2.2. A simple swing. 

We denote by x\ and x 2 the respective horizontal and vertical displacements of m\ (positive 
is right and down). Similarly, f\ and f 2 will denote the associated components of force. The 
corresponding displacements and forces at m 2 will be denoted by x%, £4 and fs, f'4. In computing 
the elongations of the three springs we shall make reference to their unstretched lengths, Li,L 2 , 
and L 3 . 

Now, if spring 1 connects (0, — Li) to (0,0) when at rest and (0, —L-i) to (x 1: x 2 ) when stretched 
then its elongation is simply 

e 1 = ^xj + (x 2 + L i y-L 1 . (2.8) 

The price one pays for moving to higher dimensions is that lengths are now expressed in terms of 
square roots. The upshot is that the elongations are not linear combinations of the end displacements 
as they were in the uni-axial case. If we presume however that the loads and stiffnesses are matched 
in the sense that the displacements are small compared with the original lengths then we may 
effectively ignore the nonlinear contribution in (2.8). In order to make this precise we need only 
recall the Taylor development of y/1 +t about t — 0, i.e., 

y/l + t = 1 + t/2 + 0(t 2 ) 

where the latter term signifies the remainder. With regard to e 1 this allows 



ei = \jx\ + x\ + 2x 2 L x + L\ - L x 

= L 1 yjl + {x\ + xl)/L\ + 2x 2 /L x - L x 

= Li + (x\ + x 2 2 )/{2L 1 ) +x 2 + LxO(((^ + xl)/L\ + 2x 2 /L 1 f) - L x 
= x 2 + {x\ + x 2 2 )/(2L 1 ) + L 1 0(((^ + x 2 2 )/L\ + 2x 2 /L 1 ) 2 ). 

If we now assume that 

(x\ + x 2 )/(2Li) is small compared to x 2 (2-9) 

then, as the O term is even smaller, we may neglect all but the first terms in the above and so 
arrive at 

ei = x 2 . 

To take a concrete example, if L\ is one meter and x\ and x 2 are each one centimeter than x 2 is 
one hundred times {x\ + x 2 )/(2Li). 
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With regard to the second spring, arguing as above, its elongation is (approximately) its stretch 
along its initial direction. As its initial direction is horizontal, its elongation is just the difference 
of the respective horizontal end displacements, namely, 

e 2 = x 3 - x\. 

Finally, the elongation of the third spring is (approximately) the difference of its respective vertical 
end displacements, i.e., 

e 3 = x 4 . 

We encode these three elongations in 



e = Ax where A 



1 (T 
-10 10 
1 



Hooke's law is an elemental piece of physics and is not perturbed by our leap from uni-axial to 
biaxial structures. The upshot is that the restoring force in each spring is still proportional to its 
elongation, i.e., yj = kjej where kj is the stiffness of the jth spring. In matrix terms, 



y = Ke where K 



ki 
k 2 
k 3i 



Balancing horizontal and vertical forces at m 1 brings 

-J/2 ~ fi = and y 1 - f 2 = 0, 
while balancing horizontal and vertical forces at m 2 brings 

1/2-/3 = and y 3 - / 4 = 0. 



We assemble these into 
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By = f where B 



and recognize, as expected, that B is nothing more than A T . Putting the pieces together, we find 
that x must satisfy Sx = f where 



S = A T KA 
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Applying one step of Gaussian Elimination brings 
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and back substitution delivers 

x A = f 4 /k 3 , 
= fi + fc, 
x 2 = /2A1, 
x 1 -x 3 = fi/k 2 . 

The second of these is remarkable in that it contains no components of x. Instead, it provides a 
condition on /. In mechanical terms, it states that there can be no equilibrium unless the horizontal 
forces on the two masses are equal and opposite. Of course one could have observed this directly 
from the layout of the truss. In modern, three-dimensional structures with thousands of members 
meant to shelter or convey humans one should not however be satisfied with the 'visual' integrity of 
the structure. In particular, one desires a detailed description of all loads that can, and, especially, 
all loads that can not, be equilibrated by the proposed truss. In algebraic terms, given a matrix 
S one desires a characterization of (1) all those / for which Sx = f possesses a solution and (2) 
all those / for which Sx = f does not possess a solution. We provide such a characterization in 
Chapter 3 in our discussion of the column space of a matrix. 

Supposing now that fi + / 3 = we note that although the system above is consistent it still 
fails to uniquely determine the four components of x. In particular, it specifies only the difference 
between x\ and x 3 . As a result both 



x 



MM 


\/ 4 /V 



and x 



( 

h/kx 
~fi/k 2 



satisfy Sx = f. In fact, one may add to either an arbitrary multiple of 

AN 



1 

V / 



z = 



(2.10) 



and still have a solution of Sx = f. Searching for the source of this lack of uniqueness we observe 
some redundancies in the columns of S. In particular, the third is simply the opposite of the first. 
As S is simply A T KA we recognize that the original fault lies with A, where again, the first and 
third columns are opposites. These redundancies are encoded in z in the sense that 

Az = 0. 



Interpreting this in mechanical terms, we view z as a displacement and Az as the resulting elonga- 
tion. In Az = we see a nonzero displacement producing zero elongation. One says in this case 
that the truss deforms without doing any work and speaks of z as an 'unstable mode.' Again, this 
mode could have been observed by a simple glance at Figure 2.2. Such is not the case for more 
complex structures and so the engineer seeks a systematic means by which all unstable modes may 
be identified. We shall see in Chapter 3 that these modes are captured by the null space of A. 

From Sz = one easily deduces that S is singular. More precisely, if S~ l were to exist then S^^-Sz 
would equal S' _1 0, i.e., z — 0, contrary to (2.10). As a result, Matlab will fail to solve Sx = f 
even when / is a force that the truss can equilibrate. One way out is to use the pseudo-inverse, as 
we shall see below. 
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2.3. A Large Planar Network 

We close with the (scalable) example of the larger planar net in Figure 2.3. Elastic fibers, 
numbered 1-20, meet at nodes, numbered 1-9. We limit our observation to the motion of the 
nodes by denoting the horizontal and vertical displacements of node j by a?2j-i and x 2 j respectively. 
Retaining the convention that down and right are positive we note that the elongation of fiber 1 is 

e 1 = x 2 - Xg 

while that of fiber 3 is 

es = x 3 - xi. 




Figure 2.3. A crude tissue model. 
As fibers 2 and 4 are neither vertical nor horizontal their elongations, in terms of nodal displace- 
ments, are not so easy to read off. This is more a nuisance than an obstacle however, for recalling 
our earlier discussion, the elongation is approximately just the stretch along its undeformed axis. 
With respect to fiber 2, as it makes the angle — 7r/4 with respect to the positive horizontal axis, we 
find 

e 2 = (xg - Xi) cos(-7r/4) + (x l0 - x 2 ) sin(— 7r/4) = (x 9 - x 1 + x 2 - x l0 )/V2. 

Similarly, as fiber 4 makes the angle — 37r/4 with respect to the positive horizontal axis, its elongation 
is 

e4 = (x 7 — X3) cos(— 37r/4) + (x s — X4) sin(— 37r/4) = (x 3 — x 7 + rr 4 — x$)/V2. 
These are both direct applications of the general formula 

ej = (x 2n -i ~ x 2m -i) cos(6» i ) + (x 2n - x 2m ) sin(^) (2.11) 

for fiber j, as depicted in the figure below, connecting node m to node n and making the angle 9j 
with the positive horizontal axis when node m is assumed to lie at the point (0,0). The reader 
should check that our expressions for e\ and indeed conform to this general formula and that e 2 
and e 4 agree with ones intuition. For example, visual inspection of the specimen suggests that fiber 
2 can not be supposed to stretch (i.e., have positive e 2 ) unless x$ > x\ and/or x 2 > x±o. Does this 
jibe with (2.11)? 
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Figure 2.4. Elongation of a generic bar, see (2.11). 



Applying (2.11) to each of the remaining fibers we arrive at e = Ax where A is 20-by-18, one 
row for each fiber, and one column for each degree of freedom. For systems of such size with such a 
well defined structure one naturally hopes to automate the construction. We have done just that in 
the accompanying M-file and diary. The M-file begins with a matrix of raw data that anyone with 
a protractor could have keyed in directly from Figure 2.3. More precisely, the data matrix has a 
row for each fiber and each row consists of the starting and ending node numbers and the angle the 
fiber makes with the positive horizontal axis. This data is precisely what (2.11) requires in order 
to know which columns of A receive the proper cos or sin. The final A matrix is displayed in the 
diary. 

The next two steps are now familiar. If K denotes the diagonal matrix of fiber stiffnesses and 
/ denotes the vector of nodal forces then y = Ke and A T y = f and so one must solve Sx = f 
where S = A T KA. In this case there is an entire three-dimensional class of z for which Az = 
and therefore Sz = 0. The three indicates that there are three independent unstable modes of the 
specimen, e.g., two translations and a rotation. As a result S is singular and x = S\f in Matlab 
will get us nowhere. The way out is to recognize that S has 18 — 3 = 15 stable modes and that if 
we restrict S to 'act' only in these directions then it 'should' be invertible. We will begin to make 
these notions precise in Chapter 4 on the Fundamental Theorem of Linear Algebra. 



Figure 2.5. The solid(dashed) circles correspond to the nodal positions before(after) the application 
of the traction force, /. 

For now let us note that every matrix possesses such a pseudo-inverse and that it may be 
computed in Matlab via the pinv command. On supposing the fiber stiffnesses to each be one 
and the edge traction to be of the form 



we arrive at x via x=pinv (S) *f and refer to Figure 2.5 for its graphical representation. 
2.4. Exercises 

1. With regard to Figure 2.1, (i) Derive the A and K matrices resulting from the removal of the 
fourth spring (but not the third mass) and assemble S = A 7 KA. 



C Q Q © 



GOO 



/ = [-1 1 1 1 1 -100010 -1 -10 -11 - 1] T , 
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(ii) Compute S* -1 , by hand via Gauss-Jordan, and compute L and U where S = LU by hand 
via the composition of elimination matrices and their inverses. Assume throughout that with 
h = k 2 = h = k, 

(iii) Use the result of (ii) with the load / = [0 F] T to solve Sx = f by hand two ways, i.e., 
x = S^ 1 / and Lc = f and Ux = c. 

2. With regard to Figure 2.2 

(i) Derive the A and K matrices resulting from the addition of a fourth (diagonal) fiber that 
runs from the top of fiber one to the second mass and assemble S = A T KA. 

(ii) Compute S* -1 , by hand via Gauss-Jordan, and compute L and U where S = LU by hand 
via the composition of elimination matrices and their inverses. Assume throughout that with 
k\ = k 2 = k 3 = A; 4 = k. 

(iii) Use the result of (ii) with the load / = [0 F 0] T to solve Sx = f by hand two ways, i.e., 
x = S^ 1 / and Lc = f and Ux = c. 

3. Generalize figure 2.3 to the case of 16 nodes connected by 42 fibers. Introduce one stiff 
(say k = 100) fiber and show how to detect it by 'properly' choosing /. Submit your well- 
documented M-file as well as the plots, similar to Figure 2.5, from which you conclude the 
presence of a stiff fiber. 

4. We generalize Figure 2.3 to permit ever finer meshes. In particular, with reference to the figure 
below we assume iV(iV — 1) nodes where the horizontal and vertical fibers each have length 
1/N while the diagonal fibers have length \/2/N . The top row of fibers is anchored to the 
ceiling. 




(i) Write and test a MATLAB function S=bignet (N) that accepts the odd number N and 
produces the stiffness matrix S = A T KA. As a check on your work we offer a spy plot of A 
when N = 5. Your K matrix should reflect the fiber lengths as spelled out in (2.1). You may 
assume Yjdj = 1 for each fiber. The sparsity of A also produces a sparse S. In order to exploit 
this, please use S=sparse (S) as the final line in bignet . m. 
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spy(A) when N=5 
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(ii) Write and test a driver called bigrun that generates 5* for N = 5 : 4 : 29 and for each N 
solves Sx = f two ways for 100 choices of /. In particular, / is a steady downward pull on 
the bottom set of nodes, with a continual increase on the pull at the center node. This can be 
done via f=zeros (size (S, 1) , 1) ; f(2:2:2*N) = le-3/N; 
for j=l:100, 

f(N+l) = f(N+l) + le-4/N; 

This construction should be repeated twice, with the code that closes §2.1 as your guide. In 
the first scenario, precompute S 1 " 1 via inv and then apply x = S~ 1 f in the j loop. In the 
second scenario precompute L and U and then apply x = U\(L\f) in the j loop. In both 
cases use tic and toe to time each for loop and so produce a graph of the form 
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Submit your well documented code, a spy plot of S when iV = 9, and a time comparison like 
(will vary with memory and cpu) that shown above. 
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3. The Column and Null Spaces 



3.1. The Column Space 



We begin with the direct geometric interpretation of matrix-vector multiplication. Namely, the 
multiplication of the n-by-1 vector x by the m-bj-n matrix S produces a linear combination of the 
columns of S. More precisely, if Sj denotes the jth column of S, then 



Sx = [si s 2 



x 2 

\XnJ 



XiSi + x 2 s 2 H h x n s r 



(3.1) 



The picture I wish to place in your mind's eye is that Sx lies in the plane spanned by the columns 
of S. This plane occurs so frequently that we find it useful to distinguish it with a 

Definition 3.1. The column space of the m-by-n matrix S is simply the span of its columns, 
i.e., 

TZ{S) = {Sx:xe W 1 }. 
This is a subset of M m . The letter 1Z stands for range. 

For example, let us recall the S matrix associated with Figure 2.2. Its column space is 
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As the first and third columns are colinear we may write 



TZ(S) 



X\ 
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As the remaining three columns are linearly independent we may go no further. We 'recognize' 
then 1Z(S) as a three dimensional subspace of M 4 . In order to use these ideas with any confidence 
we must establish careful definitions of subspace, independence, and dimension. 

A subspace is a natural generalization of line and plane. Namely, it is any set that is closed 
under vector addition and scalar multiplication. More precisely, 

Definition 3.2. A subset M of W 1 is a subspace of M d when 

(1) p + q G M whenever p G M and q G M, and 

(2) tp G M whenever p G M and tel. 

Let us confirm now that TZ(S) is indeed a subspace. Regarding (1) if p G 7Z(S) and q G TZ{S) 
then p = Sx and q = Sy for some x and y. Hence, p + q = Sx + Sy = S(x + y), i.e., (p + q) G 7Z(S). 
With respect to (2), tp = tSx = S(tx) so tp G K(S). 

This establishes that every column space is a subspace. The converse is also true. Every subspace 
is the column space of some matrix. To make sense of this we should more carefully explain what 
we mean by 'span'. 
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Definition 3.3. A collection of vectors {s\, s 2 , . . . , s n } in a subspace M is said to span M when 
M = 7Z(S) where S — [si s 2 ■ ■ ■ s n ]. 

We shall be interested in how a subspace is 'situated' in its ambient space. We shall have occasion 
to speak of complementary subspaces and even the sum of two subspaces. Lets take care of the 
latter right now, 

Definition 3.4. If M and Q are subspaces of the same ambient space, R d , we define their direct 
sum 

M © Q = {p + q : p G M and q G Q} 
as the union of all possible sums of vectors from M and Q. 

Do you see how R 3 may be written as the direct sum of R 1 and R 2 ? 

3.2. The Null Space 

Definition 3.5. The null space of an m-by-n matrix S is the collection of those vectors in R n 
that S maps to the zero vector in R m . More precisely, 

Af(S) = {x G R n : Sx = 0}. 

Let us confirm that Af(S) is in fact a subspace. If both x and y lie in Af(S) then Sx = Sy = 
and so S(x + y) — 0. In addition, S(tx) = tSx = for every t G R. 

As an example we remark that the null space of the S matrix associated with Figure 2.2 is 



/1\ 

1 

V / 



t G 



a line in R 4 . 

The null space answers the question of uniqueness of solutions to Sx = f. For, if Sx = f and 
Sy = f then S(x — y) = Sx — Sy = f — f = and so (x — y) G Af(S). Hence, a solution to Sx = f 
will be unique if, and only if, J\f(S) = {0}. 

Recalling (3.1) we note that if x G Af(S) and i^O, say, e.g., x\ ^ 0, then Sx = takes the 
form 

n 

3=2 Xl 

That is, the first column of S may be expressed as a linear combination of the remaining columns 
of S. Hence, one may determine the (in) dependence of a set of vectors by examining the null space 
of the matrix whose columns are the vectors in question. 

Definition 3.6. The vectors {s 1: s 2 , • • • , s n } are said to be linearly independent if J\f(S) = {0} 
where S — [s± s 2 • • • s n ] . 

As lines and planes are described as the set of linear combinations of one or two generators, so 
too subspaces are most conveniently described as the span of a few basis vectors. 

Definition 3.7. A collection of vectors {s±, s 2 , . . . , s n } in a subspace M is a basis for M when the 
matrix S — [si s 2 • • • s n ] satisfies 
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(1) M = TZ(S), and 

(2) Af(S) = {0}. 

The first stipulates that the columns of S span M while the second requires the columns of S to 
be linearly independent. 

3.3. A Blend of Theory and Example 

Let us compute bases for the null and column spaces of the adjacency matrix associated with 
the ladder below 



4 



Figure 3.1. An unstable ladder? 

The ladder has 8 bars and 4 nodes, so 8 degrees of freedom. Continuing to denote the horizontal 
and vertical displacements of node j by a^-i and X2j we arrive at the A matrix 
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To determine a basis for 1Z(A) we must find a way to discard its dependent columns. A moment's 
reflection reveals that columns 2 and 6 are colinear, as are columns 4 and 8. We seek, of course, 
a more systematic means of uncovering these, and perhaps other less obvious, dependencies. Such 
dependencies are more easily discerned from the row reduced form 

\ 
0-10 



1 -1 
10 
1 

/ 

Recall that rref performs the elementary row operations necessary to eliminate all nonzeros below 
the diagonal. For those who can't stand to miss any of the action I recommend rrefmovie. 

Each nonzero row of A re( j is called a pivot row. The first nonzero in each row of A rcc j is called 
a pivot. Each column that contains a pivot is called a pivot column. On account of the staircase 
nature of A re <i we find that there are as many pivot columns as there are pivot rows. In our example 
there are six of each and, again on account of the staircase nature, the pivot columns are the linearly 



A rcd = rref (A) = 
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independent columns of A m< ±. One now asks how this might help us distinguish the independent 
columns of A. For, although the rows of A re d are linear combinations of the rows of A no such thing 
is true with respect to the columns. The answer is: pay attention only to the indices of the pivot 
columns. In our example, columns {1,2,3,4,5,7} are the pivot columns. In general 

Proposition 3.1. Suppose A is m-by-n. If columns {cj : j = 1, . . . ,r} are the pivot columns of 
A re d then columns {cj : j — 1, . . . , r} of A constitute a basis for 71(A). 

Proof: Note that the pivot columns of A rc a are, by construction, linearly independent. Suppose, 
however, that columns {cj : j — 1, . . . , r} of A are linearly dependent. In this case there exists a 
nonzero iGl" for which Ax = and 

x k = 0, k#{ Cj :j = l,...,r}. (3.2) 

Now Ax = necessarily implies that A TC< ix = 0, contrary to the fact that columns {cj : j = 1, . . . , r} 
are the pivot columns of A re(i . (The implication Ax = =>- A Te(i x = follows from the fact that we 
may read row reduction as a sequence of linear transformations of A. If we denote the product of 
these transformations by T then T A = A rcd and you see why Ax = =>- A red x = 0. The reverse 
implication follows from the fact that each of our row operations is reversible, or, in the language 
of the land, invert ible.) 

We now show that the span of columns {cj : j = 1, . . . ,r} of A indeed coincides with 71(A). 
This is obvious if r = n, i.e., if all of the columns are linearly independent. If r < n there exists a 
q {cj : j — 1, . . . ,r}. Looking back at A re <i we note that its qth column is a linear combination 
of the pivot columns with indices not exceeding q. Hence, there exists an x satisfying (3.2) and 
A Ic dX = and x q — 1. This x then necessarily satisfies Ax = 0. This states that the qth column of 
A is a linear combination of columns {cj : j = 1, . . . , r} of A. End of Proof. 

Let us now exhibit a basis for Af(A). We exploit the already mentioned fact that Af(A) = 
J\f(A md ). Regarding the latter, we partition the elements of x into so called pivot variables, 

{x Cj : j = l,...,r} 

and free variables 

{ x k ■ k {cj : j = 1,.. . ,r}}. 
There are evidently n — r free variables. For convenience, let us denote these in the future by 

{x Cj :j = r + l,...,n}. 

One solves A vedL x = by expressing each of the pivot variables in terms of the nonpivot, or free, 
variables. In the example above, x±, X2, x^, X4, x$ and X7 are pivot while xq and x$ are free. Solving 
for the pivot in terms of the free we find 

x 7 = 0, x 5 = 0, £ 4 = x%, x 3 = 0, x 2 = x 6 , xi = 0, 

or, written as a vector, 
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where Xq and x$ are free. As Xq and x$ range over all real numbers the x above traces out a plane 
in M 8 . This plane is precisely the null space of A and (3.3) describes a generic element as the 
linear combination of two basis vectors. Compare this to what Matlab returns when faced with 
null (A, ' r' ) . Abstracting these calculations we arrive at 

Proposition 3.2. Suppose that A is m-hy-n with pivot indices {cj : j — 1, . . . , r} and free indices 
{cj : j — r + 1, . . . , n}. A basis for Af(A) may be constructed of n — r vectors {zi, z 2 , ■ ■ ■ , z n _ r } 
where Zk, and only z k , possesses a nonzero in its c r+ k component. 

With respect to our ladder the free indices are cj = 6 and c$ = 8. You still may be wondering what 
71(A) and Af(A) tell us about the ladder that did not already know. Regarding 7Z(A) the answer 
will come in the next chapter. The null space calculation however has revealed two independent 
motions against which the ladder does no work! Do you see that the two vectors in (3.3) encode 
rigid vertical motions of bars 4 and 5 respectively? As each of these lies in the null space of A 
the associated elongation is zero. Can you square this with the ladder as pictured in figure 3.1? I 
hope not, for vertical motion of bar 4 must 'stretch' bars 1,2,6 and 7. How does one resolve this 
(apparent) contradiction? 

We close a few more examples. We compute bases for the column and null spaces of 



hence both rows are pivot rows and columns 1 and 2 are pivot columns. Proposition 3.1 then 
informs us that the first two columns of A, namely 



comprise a basis for 1Z(A). In this case, TZ(A) = M 2 . 

Regarding M(A) we express each row of A TCt ^x = as the respective pivot variable in terms of 
the free. More precisely, x\ and rr 2 are pivot variables and x^ is free and A^x = reads 




Subtracting the first row from the second lands us at 





(3.4) 



xi + x 2 = 
—x 2 + x 3 = 



Working from the bottom up we find 



x 2 = x 3 and X\ = —Xs 



and hence every vector in the null space is of the form 




In other words 
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and 




constitutes a basis for Af(A). 

Let us stretch this example a bit and see what happens. In particular, we append a new column 
and arrive at 

1 1 2 N 
10 13 



B = 



The column space of A was already the 'whole' space and so adding a column changes, with respect 
to 11(A), nothing. That is, 11(B) = 11(A) and (3.4) is a basis for 11(B). 
Regarding M(B) we again subtract the first row from the second, 
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red 
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and identify x x and x 2 as pivot variables and x 3 and x 4 as free. We see that B red x = means 

x 1 + x 2 + 2x4 = 



or, equivalently, 
and so 



-x 2 + x 3 + x 4 = 



X2 = x 3 + £4 and x\ = —x 3 — 3x 4 
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constitutes a basis for J\f(B). 

The number of pivots, r, of an m-bj-n matrix A appears to be an important indicator. We shall 
refer to it from now on as the rank of A. Our canonical bases for 1Z(A) and M(A) possess r and 
n — r elements respectively. The number of elements in a basis for a subspace is typically called the 
dimension of the subspace. 



3.4. Exercises 



1. Which of the following subsets of R 3 are actually subspaces? Check both conditions in defini- 
tion 2 and show your work. 

(a) All vectors whose first component Xi = 0. 

(b) All vectors whose first component x± — 1. 
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(c) All vectors whose first two components obey X\X 2 = 0. 

(d) The vector (0,0,0). 

(e) All linear combinations of the pair (1,1,0) and (2, 0, 1). 

(f) All vectors for which x^ — x 2 + Sx\ = 0. 

2. I encourage you to use rref and null for the following, (i) Add a diagonal crossbar between 
nodes 3 and 2 in Figure 3.1 and compute bases for the column and null spaces of the new 
adjacency matrix. As this crossbar fails to stabilize the ladder we shall add one more bar. (ii) 
To the 9 bar ladder of (i) add a diagonal cross bar between nodes 1 and the left end of bar 6. 
Compute bases for the column and null spaces of the new adjacency matrix. 

3. We wish to show that M(A) = M(A T A) regardless of A. 

(i) We first take a concrete example. Report the findings of null when applied to A and A T A 
for the A matrix associated with Figure 3.1. 

(ii) For arbitrary A show that N(A) C J\f(A T A), i.e., that if Ax = then A T Ax = 0. 

(iii) For arbitrary A show that M(A T A) C N(A), i.e., that if A T Ax = then Ax = 0. (Hint: 
if A 7 'Ax = then x T A T Ax = and says something about \\Ax\\, recall that \\y\\ 2 = y T y.) 

4. Suppose that A is m-hy-n and that Af(A) = W n . Argue that A must be the zero matrix. 

5. Suppose that both {si, . . . , s n } and {ti, . . . ,t m } are both bases for the subspace M. Prove 
that m = n and hence that our notion of dimension makes sense. 
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4. The Fundamental Theorem of Linear Algebra 



The previous chapter, in a sense, only told half of the story. In particular, an m-by-n matrix A 
maps R n into W m and its null space lies in R™ and its column space lies in R m . Having seen examples 
where TZ{A) was a proper subspace of M m one naturally asks about what is left out. Similarly, one 
wonders about the subspace of M. n that is complimentary to M{A). These questions are answered 
by the column space and null space of A T . 

4.1. The Row Space 

As the columns of A are simply the rows of A we call TZ(A T ) the row space of A. More precisely 
Definition 4.1. The row space of the m-by-n matrix A is simply the span of its rows, i.e., 



This is a subspace of M. n . 

Regarding a basis for TZ(A T ) we recall that the rows of A^ =rref (A) are merely linear 
combinations of the rows of A and hence 



Recalling that pivot rows of A rc a are linearly independent and that all remaining rows of A re d are 
zero leads us to 

Proposition 4.1. Suppose A is m-by-n. The pivot rows of A rcd constitute a basis for 1Z(A T ). 

As there are r pivot rows of A md we find that the dimension of 1Z(A T ) is r. Recalling Proposition 
2.2 we find the dimensions of N(A) and 1Z(A T ) to be complementary, i.e., they sum to the dimension 
of the ambient space, n. Much more in fact is true. Let us compute the dot product of an arbitrary 
element x G 1Z(A T ) and z e N(A). As x = A T y for some y we find 



This states that every vector in 1Z(A T ) is perpendicular to every vector in M{A). 

Let us test this observation on the A matrix stemming from the unstable ladder of §3.4. Recall 
that 



TZ(A T ) = {A T y : y E R m }. 



n(A T )=n((A rcd ) T ). 



x T z = {A T y) T z = y T Az = 0. 



Zl = 



1 



1 




and z 2 
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constitute a basis for M(A) while the pivot rows of A red are 
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Indeed, each Zj is perpendicular to each Xk- As a result, 

{Zi, Z 2 , X!,X 2 , X 3 , X 4 , X 5 , X 6 } 

comprises a set of 8 linearly independent vectors in R 8 . These vectors then necessarily span M 8 . 
For, if they did not, there would exist nine linearly independent vectors in M 8 ! In general, we find 

Fundamental Theorem of Linear Algebra (Preliminary). Suppose A is m-by-n and has 
rank r. The row space, 1Z(A T ), and the null space, N(A), are respectively r and n — r dimensional 
subspaces of lR n . Each i£l" may be uniquely expressed in the form 

x = xr + xn, where xr G 1Z(A t ) and xn G N(A). (4.1) 

Recalling Definition 4 from Chapter 3 we may interpret (4.1) as 

W 1 = K(A T )@N(A). 

As the constituent subspaces have been shown to be orthogonal we speak of W l as the orthogonal 
direct sum of K(A T ) and M(A). 

4.2. The Left Null Space 

The Fundamental Theorem will more than likely say that M. m = TZ(A) ®J\f(A T ). In fact, this 
is already in the preliminary version. To coax it out we realize that there was nothing special 
about the choice of letters used. Hence, if B is p-by-g then the preliminary version states that 
W = TZ(B T ) © M{B). As a result, letting B = A T , p = n and q = m, we find indeed R m = 
1Z(A) (BAf(A T ). That is, the left null space, Af(A T ), is the orthogonal complement of the column 
space, 71(A). The word 'left' stems from the fact that A T y = is equivalent to y T A = 0, where y 
'acts' on A from the left. 
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In order to compute a basis for M{A T ) we merely mimic the construction of the previous section. 
Namely, we compute (A T ) rc( j and then solve for the pivot variables in terms of the free ones. 
With respect to the A matrix associated with the unstable ladder of §3.4, we find 
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We recognize the rank of A T to be 6, with pivot and free indices 

{1,2,4,5,6,7} and {3,8} 
respectively. Solving (A T ) TC ,iX = for the pivot variables in terms of the free we find 

X 7 = X%, Xq = X 8 , X 5 = 0, X 4 = 0, X 2 = X 3 , Xi = x 3 , 

or in vector form, 



x 



%3 
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+ x 8 
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These two vectors constitute a basis for M(A T ) and indeed they are both orthogonal to every column 
of A. We have now exhibited means by which one may assemble bases for the four fundamental 
subspaces. In the process we have established 

Fundamental Theorem of Linear Algebra. Suppose A is m-by-n and has rank r. One has the 
orthogonal direct sums 



= TZ{A T )®M{A) and R m = K(A) © ^(A 1 
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where the dimensions are 

dimTl(A) = dim7l(A T ) = r 

dimA/" (A) = n — r 
dimA/" (A T ) = m — r. 

We shall see many applications of this fundamental theorem. Perhaps one of the most common 
is the use of the orthogonality of 71(A) and Af(A T ) in the characterization of those b for which an 
x exists for which Ax = b. There are many instances for which 71(A) is quite large and unwieldy 
while M(A T ) is small and therefore simpler to grasp. As an example, consider the (n — l)-by-n 
'first order difference' matrix with —1 on the diagonal and 1 on the super diagonal, 

/-l 1 -\ 
0-110 

A = ' 

V • • -1 I) 

It is not difficult to see that M(A T ) = {0} and so, 71(A), being its orthogonal complement, is the 
entire space, R™^ 1 . That is, for each b e M n_1 there exists an x G W 1 such that Ax = b. The 
uniqueness of such an x is decided by Af(A). We recognize this latter space as the span of the 
vector of ones. But you already knew that adding a constant to a function does not change its 
derivative. 

4.3. Exercises 

1. True or false: support your answer. 

(i) If A is square then 71(A) = 7Z(A T ). 

(ii) If A and B have the same four fundamental subspaces then A—B. 

2. Construct bases (by hand) for the four subspaces associated with 

'1 1 -1 N 
1 -1 

Also provide a careful sketch of these subspaces. 

3. Show that if AB = then 71(B) C N(A). 

4. Why is there no matrix whose row space and null space both contain the vector [1 1 1] T ? 

5. Write down a matrix with the required property or explain why no such matrix exists. 

(a) Column space contains [1 0] T and [0 1] T while row space contains [1 1] T and [1 2] T . 

(b) Column space has basis [1 1 1] T while null space has basis [1 2 1] T . 

(c) Column space is M 4 while row space is M 3 . 

6. One often constructs matrices via outer products, e.g., given a n-by-1 vector v let us consider 
A = vv T . 

(a) Show that v is a basis for 71(A), 

(b) Show that Af(A) coincides with all vectors perpendicular to v. 

(c) What is the rank of A? 
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5. Least Squares 

We learned in the previous chapter that Ax = b need not possess a solution when the number of 
rows of A exceeds its rank, i.e., r < m. As this situation arises quite often in practice, typically in 
the guise of 'more equations than unknowns,' we establish a rationale for the absurdity Ax = b. 

5.1. The Normal Equations 

The goal is to choose x such that Ax is as close as possible to b. Measuring closeness in terms 
of the sum of the squares of the components we arrive at the 'least squares' problem of minimizing 

\\Ax-b\\ 2 = (Ax-b) T (Ax-b) (5.1) 

over all i G K" The path to the solution is illuminated by the Fundamental Theorem. More 
precisely, we write 

b = b R + b N where b R e 71(A) and b N G Af(A T ). 

On noting that (i) (Ax — b R ) G 71(A) for every x G W l and (ii) 71(A) _L M(A T ) we arrive at the 
Pythagorean Theorem 

\\Ax - b\\ 2 = \\Ax -b R - b N f = \\Ax - b R \\ 2 + ||6jv|| 2 , (5.2) 

It is now clear from (5.2) that the best x is the one that satisfies 

Ax = b R . (5.3) 

As b R G 71(A) this equation indeed possesses a solution. We have yet however to specify how one 
computes b R given b. Although an explicit expression for b R , the so called orthogonal projection 
of b onto 71(A), in terms of A and b is within our grasp we shall, strictly speaking, not require it. 
To see this, let us note that if x satisfies (5.3) then 

Ax - b = Ax - b R - b N = -b N . (5.4) 

As b^ is no more easily computed than b R you may claim that we are just going in circles. The 
'practical' information in (5.4) however is that (Ax — b) G Af(A T ), i.e., A T (Ax — b) = 0, i.e., 

A T Ax = A T b. (5.5) 

As A T b G 7l(A T ) regardless of b this system, often referred to as the normal equations, indeed 
has a solution. This solution is unique so long as the columns of A T A are linearly independent, 
i.e., so long as M(A T A) = {0}. Recalling Chapter 2, Exercise 2, we note that this is equivalent to 
M(A) = {0}. We summarize our findings in 

Proposition 5.1. The set of x G lR n for which the misfit \\Ax — b\\ 2 is smallest is composed of 
those x for which 

A T Ax = A T b. 

There is always at least one such x. 

There is exactly one such x iff Af(A) = {0}. 

As a concrete example, suppose with reference to the figure below that 



A = 1 and b 
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2 -2 



Figure 5.1. The decomposition of b. 

As b 7^ 71(A) there is no x such that Ax = b. Indeed, 

\\Ax - b\\ 2 = (xi + x 2 - l) 2 + (x 2 - l) 2 
with the minimum uniquely attained at 



1 > 1, 



x 



in agreement with the unique solution of (5.5), for 



A A 



We now recognize, a posteriori, that 



1 1 
1 2 



and A T b 



bn = Ax 



is the orthogonal projection of b onto the column space of A. 

5.2. Applying Least Squares to the Biaxial Test Problem 

We shall formulate the identification of the 20 fiber stiffnesses in Figure 2.3, as a least squares 
problem. We envision loading, /, the 9 nodes and measuring the associated 18 displacements, x. 
From knowledge of x and / we wish to infer the components of K = diag(A;) where k is the vector 
of unknown fiber stiffnesses. The first step is to recognize that 



may be written as 



A T KAx = f 



Bk = f where B = A T diag(Ax). 



(5.6) 
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Though conceptually simple this is not of great use in practice, for B is 18-by-20 and hence (5.6) 
possesses many solutions. The way out is to compute k as the result of more than one experiment. 
We shall see that, for our small sample, 2 experiments will suffice. 

To be precise, we suppose that x^ is the displacement produced by loading /W while x^ is the 
displacement produced by loading f^ 2 \ We then piggyback the associated pieces in 



B 



A T d^g{Ax^] 
A T d±3.g(Ax^\ 



and / 



;(2) 



This B is 36-by-20 and so the system Bk = f is overdetermined and hence ripe for least squares. 

We proceed then to assemble B and /. We suppose f^ 1 ' and to correspond to horizontal 
and vertical stretching 

fW = [-1 1 -1 1 -1 1 0] T 
f {2) = [0 101010000000 -1 -1 -1] T 



respectively. For the purpose of our example we suppose that each kj 
assemble A T KA as in Chapter 2 and solve 



1 except k$ = 5. We 



A T KAx {j) = f 



Li) 



with the help of the pseudoinverse. In order to impart some 'reality' to this problem we taint 
each x^> with 10 percent noise prior to constructing B. Please see the attached M-file for details. 
Regarding 

B T Bk = B T f 

we note that Matlab solves this system when presented with k=B\f when B is rectangular. We 
have plotted the results of this procedure in the figure below 




in 
w 
CD 

c 



CD 




fiber number 

Figure 5.2. Results of a successful biaxial test. 
The stiff fiber is readily identified. 
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5.3. Projections 

From an algebraic point of view (5.5) is an elegant reformulation of the least squares problem. 
Though easy to remember it unfortunately obscures the geometric content, suggested by the word 
'projection,' of (5.4). As projections arise frequently in many applications we pause here to develop 
them more carefully. 

With respect to the normal equations we note that if M(A) = {0} then 

x = (A T A)- 1 A T b 
and so the orthogonal projection of b onto 71(A) is 

b R = Ax = A(A T A)- 1 A T b. (5.7) 

Defining 

P = A(A T A)~ 1 A T , (5.8) 

(5.7) takes the form b R = Pb. Commensurate with our notion of what a 'projection' should be we 
expect that P map vectors not in 71(A) onto 71(A) while leaving vectors already in 71(A) unscathed. 
More succinctly, we expect that Pb R = b R , i.e., PPb = Pb. As the latter should hold for all b £ M. m 
we expect that 

P 2 = P. (5.9) 

With respect to (5.8) we find that indeed 

P 2 = A(A T A)- 1 A T A(A T A)- 1 A T = A(A T A)~ 1 A T = P. 

We also note that the P in (5.8) is symmetric. We dignify these properties through 

Definition 5.1. A matrix P that satisfies P 2 = P is called a projection. A symmetric projection 
is called an orthogonal projection. 

We have taken some pains to motivate the use of the word 'projection.' You may be wondering 
however what symmetry has to do with orthogonality. We explain this in terms of the tautology 

b = Pb + (I - P)b. 

Now, if P is a projection then so too is (I — P). Moreover, if P is symmetric then the dot product 
of 6's two constituents is 

(Pb) T (I - P)b = b T P T {I - P)b = b T (P - P 2 )b = b T 0b = 0, 

i.e., Pb is orthogonal to (/ — P)b. 

As examples of nonorthogonal projections we offer 

(] n) and (-1/2 0] 
V 1 U / \-l/4 -1/2 lj 

Finally, let us note that the central formula, P = A(A T A)^ 1 A T , is even a bit more general than 
advertised. It has been billed as the orthogonal projection onto the column space of A. The need 
often arises however for the orthogonal projection onto some arbitrary subspace M. The key to 
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using the old P is simply to realize that every subspace is the column space of some matrix. More 
precisely, if 



is a basis for M then clearly if these Xj are placed into the columns of a matrix called A then 
71(A) = M. For example, if M is the line through [1 1] T then 



is orthogonal projection onto M. 
5.4. Exercises 

1. A steal beam was stretched to lengths £ = 6, 7, and 8 feet under applied forces of / = 1, 2, 
and 4 tons. Assuming Hooke's law £ — L = cf, find its compliance, c, and original length, L, 
by least squares. 

2. With regard to the example of §5.3 note that, due to the the random generation of the noise 
that taints the displacements, one gets a different 'answer' every time the code is invoked. 

(i) Write a loop that invokes the code a statistically significant number of times and submit 
bar plots of the average fiber stiffness and its standard deviation for each fiber, along with the 
associated M-file. 

(ii) Experiment with various noise levels with the goal of determining the level above which it 
becomes difficult to discern the stiff fiber. Carefully explain your findings. 

3. Find the matrix that projects M 3 onto the line spanned by [1 1] T . 

4. Find the matrix that projects M 3 onto the plane spanned by [1 1] T and [1 1 — 1] T . 

5. If P is the projection of M m onto a /c-dimensional subspace M, what is the rank of P and what 





is TZ(P)? 
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6. Matrix Methods for Dynamical Systems 



Up to this point we have largely been concerned with (i) deriving linear systems of algebraic equa- 
tions (from considerations of static equilibrium) and (ii) the solution of such systems via Gaussian 
elimination. 

In this section we hope to begin to persuade the reader that our tools extend in a natural fashion 
to the class of dynamic processes. More precisely, we shall argue that (i) Matrix Algebra plays a 
central role in the derivation of mathematical models of dynamical systems and that, with the aid 
of the Laplace transform in an analytical setting or the Backward Euler method in the numerical 
setting, (ii) Gaussian elimination indeed produces the solution. 

6.1. Neurons and the Dynamic Strang Quartet 

A nerve fiber's natural electrical stimulus is not direct current but rather a short burst of current, 
the so-called nervous impulse. In such a dynamic environment the cell's membrane behaves not 
only like a leaky conductor but also like a charge separator, or capacitor. 
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Figure 6.1. An RC model of a neuron. 
The typical value of a cell's membrane capacitance is 

c = 1 {fiF/cm 2 ) 

where fiF denotes micro-Farad. The capacitance of a single compartment is therefore 

C m = 2ita(t/N)c 

and runs parallel to each R m , see figure 6.1. We ask now how the static Strang Quartet of chapter 
one should be augmented. Regarding (SI') we proceed as before. The voltage drops are 



and so 
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In (S2) we must now augment Ohm's law with voltage-current law obeyed by a capacitor, namely 
- the current through a capacitor is proportional to the time rate of change of the potential across 
it. This yields, (denoting d/dt by '), 

Vi = C m e[, y2 = e 2 /R m , 2/3 = e 3 /-Rj, 2/4 = C m e' 4 , 
y 5 = e 5 /R m , y 6 = e 6 /Ri, y 7 = C m e' 7 , y 8 = e 8 /R m , 

or, in matrix terms, 

y = Ge + Ce' 

where 

G = diag(0 G m Gi G rn Gi G m ) 
C = diag(C m C m C m 0) 

are the conductance and capacitance matrices. 

As Kirchhoff's Current law is insensitive to the type of device occupying an edge, step (S3) 
proceeds exactly as above. 

io ~ Vi ~ V2 - 2/3 = 1/3-2/4-2/5-2/6 = 2/6 - 2/7 - 2/8 = 0, 
or, in matrix terms, 

A T y = -f where /=[ lo OO] T 
Step (S4) remains one of assembling, 

A T y = -/=>► A T (Ge + Ce') = -f A T {G{b - Ax) + C(b' - Ax')) = -f, 

becomes 



AC Ax' + AG Ax = A Gb + f + A Cb' . (6.1) 



This is the general form of the potential equations for an RC circuit. It presumes of the user 
knowledge of the initial value of each of the potentials, 

x(0) = X. (6.2) 

Regarding the circuit of figure 6.1 we find 
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A T Gb = E m I G m I and A T Cb' 
and an initial (rest) potential of 

x(0) = E m [l 1 1] T . 

We shall now outline two modes of attack on such problems. The Laplace Transform is an analytical 
tool that produces exact, closed-form, solutions for small tractable systems and therefore offers 
insight into how larger systems 'should' behave. The Backward-Euler method is a technique for 
solving a discretized (and therefore approximate) version of (6.1). It is highly flexible, easy to code, 
and works on problems of great size. Both the Backward-Euler and Laplace Transform methods 



37 



require, at their core, the algebraic solution of a linear system of equations. In deriving these 
methods we shall find it more convenient to proceed from the generic system 



x' = Bx + g. (6.3) 

With respect to our fiber problem 

B = -(A T CA)- 1 A T GA 

I f—(Gi + G m ) Gi \ (6 4) 

Gi —(2Gi + G rn ) G L 

Gi -(Gi + Gr, 



and 

i / E m G 

g = (A T CA)- 1 (A T Gb + f) = -M E m G n 



Gn 



E G 



6.2. The Laplace Transform 

The Laplace Transform is typically credited with taking dynamical problems into static problems. 
Recall that the Laplace Transform of the function h is 

POD 

(Ch)(s) = / e- st h(t)dt. 
Jo 

where s is a complex variable. We shall soon take a 2 chapter dive into the theory of complex 
variables and functions. But for now let us proceed calmly and confidently and follow the lead of 
Matlab . For example 



>> syms t 

>> laplace (exp (t ) ) 

ans = 1/ (s-1) 

>> laplace (t*exp (-t) ) 

ans = 1/ (s + 1) 2 



The Laplace Transform of a matrix of functions is simply the matrix of Laplace transforms of 
the individual elements. For example 




Now, in preparing to apply the Laplace transform to (6.3) we write it as 

Cx' = C(Bx + g) (6.5) 

and so must determine how C acts on derivatives and sums. With respect to the latter it follows 
directly from the definition that 

C{Bx + g)= CBx + Cg = BCx + Cg. (6.6) 
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Regarding its effect on the derivative we find, on integrating by parts, that 



i>oo POO 

Cx' = / e~ st x'(t) dt = x(t)e" st + s / e~ st x(t) dt. 
Jo o Jo 

Supposing that x and s are such that x(t)e~ st — > as t — > oo we arrive at 

Cx' = sCx-x(0). (6.7) 
Now, upon substituting (6.6) and (6.7) into (6.5) we find 

sCx — x(0) = BCx + Cg, 
which is easily recognized to be a linear system for Cx, namely 

(sI-B)Cx = Cg + x(0). (6.8) 

The only thing that distinguishes this system from those encountered since chapter 1 is the presence 
of the complex variable s. This complicates the mechanical steps of Gaussian Elimination or the 
Gauss- Jordan Method but the methods indeed apply without change. Taking up the latter method, 
we write 

Cx = (si - B)- 1 (Cg + x(ti)). 

The matrix (si — B)^ 1 is typically called the resolvent of B at s. We turn to Matlab for its 
symbolic calculation. For example, 

>> syms s 

>> B = [2 -1;-1 2] 

>> R = inv (s*eye (2) -B) 

R = 

[ (s-2) / (s*s-4*s+3) , -1/ (s*s-4*s+3) ] 

[ -1/ (s*s-4*s+3) , (s-2) / (s*s-4*s + 3) ] 
We note that (si — B)^ 1 is well defined except at the roots of the quadratic, s 2 — As + 3. This 
quadratic is the determinant of (sI—B) and is often referred to as the characteristic polynomial 
of B. The roots of the characteristic polynomial are called the eigenvalues of B. We will develop 
each of these new mathematical objects over the coming chapters. We mention them here only to 
point out that they are all latent in the resolvent. 

As a second example let us take the B matrix of (6.4) with the parameter choices specified in 
f ib3 .m, namely 

1 f~ 5 3 

B = — 3 -8 3 I . (6.9) 
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The associated resolvent is 



where 



, js 2 + 1.3s + 0.31 0.3s + 0.15 0.09 
(sI-BY l = — — 0.3s + 0.15 s 2 + s + 0.25 0.3s + 0.15 

XB ( S ) \ 0.09 0.3s + 0.15 s 2 + 1.3s + 0.31 



XB (s) = s 3 + 1.8s 2 + 0.87s + 0.11 (6.10) 
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is the characteristic polynomial of B. Assuming a current stimulus of the form io(t) = t exp(— 1/4)/1000, 
and E m = brings 

/l.965/(s + 1/4) 2N 

V 

and so (6.10) persists in 

Cx = (sI-B)- l Cg 

fs 2 ■ 1.3,- ■ ().:', l 
1 0.3s + 0.15 
0.09 



(s + l/4) 2 (s 3 + 1.8s 2 + 0.87s + 0.11) 



Now comes the rub. A simple linear solve (or inversion) has left us with the Laplace transform of 
x. The accursed No Free Lunch Theorem informs us that we shall have to do some work in order 
to recover x from Cx. 

In coming sections we shall establish that the inverse Laplace transform of a function h is 

(C- 1 h)(t) = — I h(s)exp(st)ds, (6.11) 
2m J c 

where s runs along C, a closed curve in the complex plane that encircles all of the singularities of 
h. We don't suppose the reader to have yet encountered integration in the complex plane and so 
please view (6.11) as a preview pf coming attractions. 

With the inverse Laplace transform one may express the solution of (6.3) as 

x(t) = C~\sl - B)-\Cg + x(0)). (6.12) 

As an example, let us take the first component of Cx, namely 

1.965(s 2 + 1.3s + 0.31) 
1[S) (s + l/4) 2 (s3 + 1.8s 2 + 0.87s + 0.11)) ' 1 j 

The singularities, or poles, are the points s at which Cxi(s) blows up. These are clearly the roots 
of its denominator, namely 

-11/10, -1/2, -1/5 and -1/4, (6.14) 

and hence the curve, C, in (6.11), must encircle these. We turn to Matlab however to actually 
evaluate (6.11). Referring to f ib3 .m for details we note that the ilaplace command produces 



xi(f) = 1.965 ( 8e" t/2 - (l/17) e -' /4 (40912/17 + 76*) 

(6.15) 

+ (400/3)e"* /5 + (200/867)e- lli/10 
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Figure 6.2. The 3 potentials associated with figure 6.1. 

The other potentials, see the figure above, possess similar expressions. Please note that each of the 
poles of Cx\ appear as exponents in x\ and that the coefficients of the exponentials are polynomials 
whose degrees are determined by the orders of the respective poles, f ib3 .m 

6.3. The Backward-Euler Method 

Where in the previous section we tackled the derivative in (6.3) via an integral transform we 
pursue in this section a much simpler strategy, namely, replace the derivative with a finite difference 
quotient. That is, one chooses a small dt and 'replaces' (6.3) with 

m--<t-dt) = m)+g(t) (616) 

The utility of (6.16) is that it gives a means of solving for x at the present time, t, from knowledge 
of x in the immediate past, t — dt. For example, as x(0) = x(0) is supposed known we write (6.16) 

as 

(I/dt - B)x(dt) = x{0)/dt + g{dt). 
Solving this for x(dt) we return to (6.16) and find 

(I/dt - B)x(2dt) = x(dt)/dt + g(2dt) 

and solve for x(2dt). The general step from past to present, 

x(jdt) = (I/dt - B)-\x((j - l)dt)/dt + g(jdt)), (6.17) 

is repeated until some desired final time, Tdt, is reached. This equation has been implemented in 
f ib3 . m with dt = 1 and B and g as above. The resulting x (run f ib3 yourself!) is indistinguishable 
from figure 6.2. 

Comparing the two representations, (6.12) and (6.17), we see that they both produce the solution 
to the general linear system of ordinary equations, (6.3), by simply inverting a shifted copy of B. 
The former representation is hard but exact while the latter is easy but approximate. Of course we 
should expect the approximate solution, x, to approach the exact solution, x, as the time step, dt, 
approaches zero. To see this let us return to (6.17) and assume, for now, that g = 0. In this case, 
one can reverse the above steps and arrive at the representation 

x(jdt) = ((I - dtBy v ) j x(Q). (6.18) 
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Now, for a fixed time t we suppose that dt = t/j and ask whether 

x(t)= lim((/-(t/j) J B)- 1 )M0). 

This limit, at least when B is one-by-one, yields the exponential 

x(t) = exp(Bt)x(0), 

clearly the correct solution to (6.3). A careful explication of the matrix exponential and its 
relationship to (6.12) will have to wait until we have mastered the inverse Laplace transform. 

6.4. Dynamics of Mechanical Systems 

Regarding the fiber nets of Chapter 2, we may move from the equilibrium equations, for the 
displacement x due to a constant force, /, 

Sx = f, where S = A T KA, 

to the dynamical equations for the displacement, x(t), due to a time varying force, f(t), and or 
nonequilibrium initial conditions, by simply appending the Newtonian inertial terms, i.e., 

Mx"(t) + Sx(t) = f(t), x(0) = x , x'(0) = v , (6.19) 

where M is the diagonal matrix of node masses, xq denotes their initial displacement and vq denotes 
their initial velocity. 

We transform this system of second order differential equations to an equivalent first order system 
by introducing 

ui = x and u 2 = u\ 
and then noting that (6.19) takes the form 

u ' 2 = x " = -M- l S Ul + M- l f\t). 
As such, we find that u = (u\ w 2 ) T obeys the familiar 

u' = Bu + g, «(0) = M (6.20) 

where 

B =(-A/% o)' *=(A)' "° = G)' (621> 

Let us consider the concrete example of the chain of three masses in Fig. 2.1. If each node has mass 
m and each spring has stiffness k then 

u ( 2 -1 \ 

(6.22) 

0-12/ 
The associated characteristic polynomial of B is 

XB (s) = s 6 + 6cs 4 + 10cV + 4c 3 , where c = k/m, (6.23) 
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is a cubic in s 2 with simple roots at —2c and — 2c± \[2c. And so the eigenvalues of B are the three 
purely imaginary numbers 



\ 1= iJ2c-V2c, \ 2 = iV2c, A 



\3 = % 



2c+V2c 



(6.24) 



and their complex conjugates, A4 = — Ai, A5 = — A2 and X e = —A3. Next, if the exogenous force, /, 
is 0, and the initial disturbance is simply xi(0) = 1 then 



Cui(s) 



Xb(s) 



'3c 2 s + Acs 3 + s 5N 
cs(s 2 + 2c) 



2 

c z s 



(6.25) 



On computing the inverse Laplace Transform we (will) find 

6 

Xl (t) = exp(A i t)(3c 2 A j + 4cA^ + Aj) exp(A^) 
3=1 



(s - A,) 



Xb(s) 



(6.26) 



s=A, 



that is, X\ is a weighted sum of exponentials. As each of the rates are purely imaginary it follows 
that our masses will simply oscillate according to weighted sums of three sinusoids. For example, 
note that 



exp(A2t) = cos(v2ct) +isin(v2ct) and exp(Ast) = cos(v2ct) — isin(v2ct). 

Of course such sums may reproduce quite complicated behavior. We have illustrated each of the 
displacements in Figure 6.3 in the case that c = 1. Rather than plotting the complex, yet explicit, 
expression (6.26) for x\, we simply implement the Backward Euler scheme of the previous section. 




10 20 30 40 50 60 70 



Figure 6.3. The displacements of the 3 masses in Fig. 2.1, with k/m = 1, following an initial dis- 
placement of the first mass. For viewing purposes we have offset the three displacements, chain . m 
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6.5. Exercises 



1. Compute, without the aid of a machine, the Laplace transforms of e* and te 1 . Show all of your 
work. 

2. Extract from f ib3 . m analytical expressions for x 2 and x 3 . 

3. Use eig to compute the eigenvalues of B as given in (6.9). Use poly to compute the charac- 
teristic polynomial of B. Use roots to compute the roots of this characteristic polynomial. 
Compare these to the results of eig. How does Matlab compute the roots of a polynomial? 
(type type roots) for the answer). Submit a MATLAB diary of your findings. 

4. Adapt the Backward Euler portion of f ib3 . m so that one may specify an arbitrary number 
of compartments, as in f ibl . m. As B, and so S, is now large and sparse please create the 
sparse B via spdiags and the sparse / via speye, and then prefactor S into LU and use 
U\L\ rather than S\ in the time loop. Experiment to find the proper choice of dt. Submit 
your well documented M-file along with a plot of x\ and X50 versus time (on the same well 
labeled graph) for a 100 compartment cable. 

5. Derive (6.18) from (6.17) by working backwards toward x(0). Along the way you should explain 
why (I/dt - B)~ 1 /dt = (/ - dtB)- 1 . 

6. Show, for scalar B, that ((1 — (t/j)B)~ l y — > exp(Bt) as j — > 00. Hint: By definition 

((1 - (t/j)B)- l y = exp(jlog(l/(l - (t/j)B))) 
now use L'Hopital's rule to show that j log(l/(l — (t/j)B)) — > Bt. 

7. If we place a viscous damper in parallel with each spring in Figure 2.1 as below 




then the dynamical system (6.20) takes the form 

Mx"(t) + Dx'{t) + Sx{t) = f{t) (6.27) 

where D = A T diag(d)A where d is the vector of damping constants. Modify chain . m to solve 
this new system and use your code to reproduce the figure below. 
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t 

Figure 6.1. The displacement of the three masses in the weakly damped chain, where k/m — 1 
and d/m = 1/10. 
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7. Complex Numbers and Functions 



7.1. Complex Algebra 

A complex number is simply a pair of real numbers. In order to stress however that the two 
algebras differ we separate the two real pieces by the symbol +i. More precisely, each complex 
number, z, may be uniquely expressed by the combination x + iy where x and y are real and % 
denotes \f—\. We call x the real part and y the imaginary part of z. We now summarize the 
main rules of complex arithmetic. 

If 

Z1—X1 + iyi and z 2 = x 2 + iy 2 



then 



z\ + z 2 = (xi + x 2 ) + i(yi + y 2 ) 

z\z 2 = (x 1 + iyi){x 2 + iy 2 ) = (x x x 2 - y x y 2 ) + i{x x y 2 + x 2 y x ) 
z 1 = x 1 - iy u 

£1 _ £1^2 _ (xix 2 + y x y 2 ) + i{x 2 y x - x x y 2 ) 
z 2 z 2 z 2 x\ + y\ 



\z x \ = \fz^z\ = yjxl + yf, 
In addition to the Cartesian representation z = x + iy one also has the polar form 

z = |^|(cos^ + % sin^), where 9 e (— 7r, 7r] and 

'tt/2, if x = 0, y > 0, 



9 = atan2(y, x) 



-tt/2, if x = 0, y < 0, 

arctan(y/a;) if x > 0, 

arctan(t//x) + n if x < 0, y > 0, 

k arctan(y/a;) — 7r if rr < 0, y < 0. 



This form is especially convenient with regards to multiplication. More precisely, 

ziz 2 = \zi I \z 2 1 {(cos 9i cos9 2 — sin9 1 sin9 2 ) + i(cos9i sin9 2 + sin9i cos^ 2 )} 
= \z 1 \\z 2 \{cos(9 1 + 9 2 ) + i sin(9 1 + 9 2 )}. 

As a result, 

z n = I z 1 11 (cos (n0) +ism(n9)). 

A complex vector (matrix) is simply a vector (matrix) of complex numbers. Vector and matrix 
addition proceed, as in the real case, from elementwise addition. The dot or inner product of two 
complex vectors requires, however, a little modification. This is evident when we try to use the old 
notion to define the length of complex vector. To wit, note that if 



z = 



l + i 
l-i 



then 

z T z = (1 + if + (1 - if = 1 + 2i - 1 + 1 - 2i - 1 = 0. 
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Now length should measure the distance from a point to the origin and should only be zero for the 
zero vector. The fix, as you have probably guessed, is to sum the squares of the magnitudes of 
the components of z. This is accomplished by simply conjugating one of the vectors. Namely, we 
define the length of a complex vector via 



\\ Z \\ EE \/Wz. (7.1) 

In the example above this produces 



VI 1 +i\ 2 + |1 ~i\ 2 = 41 = 2. 

As each real number is the conjugate of itself, this new definition subsumes its real counterpart. 

The notion of magnitude also gives us a way to define limits and hence will permit us to introduce 
complex calculus. We say that the sequence of complex numbers, {z n : n — 1,2, . . .}, converges to 
the complex number z and write 

z n -> z or z = lim z n , 

n^oo 

when, presented with any e > one can produce an integer N for which \z n — z \ < e when n > N. 
As an example, we note that (i/2) n — > 0. 

7.2. Complex Functions 

A complex function is merely a rule for assigning certain complex numbers to other complex 
numbers. The simplest (nonconstant) assignment is the identity function f(z) = z. Perhaps the 
next simplest function assigns to each number its square, i.e., f(z) = z 2 . As we decomposed the 
argument of /, namely z, into its real and imaginary parts, we shall also find it convenient to 
partition the value of /, z 2 in this case, into its real and imaginary parts. In general, we write 

f(x + iy) = u(x, y) + iv(x, y) 

where u and v are both real-valued functions of two real variables. In the case that f(z) = z 2 we 
find 

u(x,y)=x 2 — y 2 and v(x,y) = 2xy. 
With the tools of the previous section we may produce complex polynomials 

f(z) =z m + c m - x z m - x + • • • + CiZ + c . 

We say that such an / is of degree m. We shall often find it convenient to represent polynomials 
as the product of their factors, namely 

f(z) = (z- A x ) mi (z - A 2 ) m2 • • • (z - \ h ) m \ (7.2) 

Each Xj is a root of / of degree rrij. Here h is the number of distinct roots of /. We call \j a 
simple root when m 3 ■, — 1. In chapter 6 we observed the appearance of ratios of polynomials or so 
called rational functions. Suppose 

rM = M 
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is rational, that / is of order at most to— 1 while g is of order to with the simple roots {Ai, . . . , A m }. 
It should come as no surprise that such a r should admit a Partial Fraction Expansion 



r ( z ) = Yl 



i Z - ^3 

One uncovers the rj by first multiplying each side by (z — Xj) and then letting z tend to Xj. For 
example, if 

1 = -^ + -DL (7 .3) 
z z + 1 z + % z — I 

then multiplying each side by (z + i) produces 

1 r 2 (z + i) 

- = n + 



z — i z — i 



Now, in order to isolate r\ it is clear that we should set z = —i. So doing we find r 1 = i/2. In order 
to find r 2 we multiply (7.3) by (z — i) and then set z — i. So doing we find r 2 = and so 

: + r • (7.4) 



z 2 + 1 z + i 

Returning to the general case, we encode the above in the simple formula 



Tj = \im(z- Xj)r(z). (7.5) 

Z->\j 

You should be able to use this to confirm that 

z V 2 I/ 2 

= — 1 — . (7.6) 

z 2 + lz + iz — i 

Recall that the resolvent we met in Chapter 6 was in fact a matrix of rational functions. Now, the 
partial fraction expansion of a matrix of rational functions is simply the matrix of partial fraction 
expansions of each of its elements. This is easier done than said. For example, the resolvent of 



B 



1 
-1 



is 

(zI-BY 



z 



2 +1 V-l 



z 



(7.7) 

1/2 i/2\ ^ 1 (1/2 -i/2 x 



~ z + i \-i/2 l/2j T z -j \i/2 1/2 

The first line comes from either Gauss-Jordan by hand or via the symbolic toolbox in Matlab . 
More importantly, the second line is simply an amalgamation of (7.3) and (7.4). Complex matrices 
have finally entered the picture. We shall devote all of Chapter 9 to uncovering the remarkable 
properties enjoyed by the matrices that appear in the partial fraction expansion of (zl — B)^ 1 . 
Have you noticed that, in our example, the two matrices are each projections, that they sum to /, 
and that their product is 0? Could this be an accident? To answer this we will also need to develop 
{zl — B)~ l in a geometric expansion. At its simplest, the n-term geometric series, for z ^ 1, is 

E** = \zV- (7 - 8) 

k=0 
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We will prove this in the exercises and use it to appreciate the beautiful orthogonality of the columns 
of the Fourier Matrix. 

In Chapter 6 we were confronted with the complex exponential when considering the Laplace 
Transform. By analogy to the real exponential we define 



e z 

n=0 



E 

and find that 

e w = 1 + %9 + (i9) 2 /2 + (i9) 3 /3\ + (i9) 4 /4\ + ■■■ 

= (1 - 9 2 /2 + 9 4 /A\ )+i(9- 9 3 /3\ + 9 5 /5\ ) 

= cos 6 + % sin#. 

With this observation, the polar form is now simply z — \z\e 10 . One may just as easily verify that 

_|_ g— id giS e~*^ 

cos 9 = and sin 9 



2 2i 
These suggest the definitions, for complex z, of 

Q iZ _|_ Q~ iZ 

cosz = and sin z = 



2 2i 
As in the real case the exponential enjoys the property that 



and in particular 



gZl+Z2 _ g 2 lg z 2 



Q x+iy _ Q x e iy _ gZ cog y _j_ ^ y 



Finally, the inverse of the complex exponential is the complex logarithm, 

In z = ln(|z|) + i9, for z = \z\e ld . 
One finds that ln(-l + i) = In ^2 + i3n/A. 
7.3. Complex Differentiation 

The complex function / is said to be differentiable at z if 

z—>z Z — Zq 

exists, by which we mean that 

f(z n ) - f(zp) 

converges to the same value for every sequence {z n } that converges to z . In this case we naturally 
call the limit f'(z ). 

Example: The derivative of z 2 is 2z. 

hm Z -^L = lim ( "-"° )(Z + " o) = 2, . 

z^z Z — Zq z->z Z — Zq 
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Example: The exponential is its own derivative. 

e z — e z ° 9 z-z _ i °° ( z — za\ n 

lim f e — = l im ? i = e *° lim V \ , 

z->zo -2 — ^0 z^zo Z — z-»z in + 1)! 

n=0 



e 2 °. 



Example: The real part of z is not a differentiable function of z. 

We show that the limit depends on the angle of approach. First, when z n — > ^ on a line parallel 
to the real axis, e.g., z n = xo + 1/n + iy Q , we find 

,. x + l/n-x 

hm 



n^oo x + 1/n + iyo - (^o + Wo) 
while if 2 n — > 2 in the imaginary direction, e.g., z n = Xo + i(yo + 1/n), then 

hm — : — - = 0. 

n^oo x + i(y + 1/n) - (x + iy ) 

This last example suggests that when / is differentiable a simple relationship must bind its 
partial derivatives in x and y. 

Proposition 7.1. If / is differentiable at z then 

ft \ d f ( \ d f I \ 
f{z ) = -{z Q ) = ^-{z Q ). 

Proof: With z = x + iy , 

f(z ) = Hm m ~ fiZ0) = hm /(* + *yo)-/(*° + *yo) = |/(, o) . 

z^z Z — Zq x^x x — Xo OX 



Alternatively, when z = xq + iy then 



x r f(z)-f(z ) f(x + iy) - f(x + iy ) .df 

f (z ) = hm = hm - = -i — (z ). 

z^zo z - z y^yo i(y - y ) dy 

End of Proof. 

In terms of the real and imaginary parts of / this result brings the Cauchy-Riemann equa- 
tions 



du dv dv du 

dx dy dx dy 



(7.9) 



Regarding the converse proposition we note that when / has continuous partial derivatives in region 
obeying the Cauchy-Riemann equations then / is in fact differentiable in the region. 

We remark that with no more energy than that expended on their real cousins one may uncover 
the rules for differentiating complex sums, products, quotients, and compositions. 

As one important application of the derivative let us attempt to expand in partial fractions a 
rational function whose denominator has a root with degree larger than one. As a warm-up let us 
try to find r± t i and r 12 in the expansion 

Z + 2 n,i 7-1,2 



(z + i) 2 z + i (z + iy 
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Arguing as above it seems wise to multiply through by (z + l) 2 and so arrive at 

z + 2 = r Xl (z + 1) + r X2 . (7.10) 

On setting z — — 1 this gives r 1)2 = 1. With r lj2 computed (7.10) takes the simple form z + 1 — 
ri : i(z + 1) and so 7*1,1 = 1 as well. Hence 

z + 2 1 1 

+ 



(z + 1) 2 z + 1 (z + 1) 2 ' 
This latter step grows more cumbersome for roots of higher degree. Let us consider 

(z + 2) 2 _ ri,! | n,2 | ^1,3 



(z+lf Z+1 (z+1) 2 (z+lf 

The first step is still correct: multiply through by the factor at its highest degree, here 3. This 
leaves us with 

(z + 2f = r Xl (z + if + r X2 (z + 1) + n, 3 . (7.11) 

Setting z — — 1 again produces the last coefficient, here 7*1,3 = 1. We are left however with one 
equation in two unknowns. Well, not really one equation, for (7.11) is to hold for all z. We exploit 
this by taking two derivatives, with respect to z, of (7.11). This produces 

2(z + 2) = 2r hl (z+ 1) +n >2 and 2 = 2r hl . 

The latter of course needs no comment. We derive 7*1,2 from the former by setting z = —1. We 
generalize from this example and arrive at 

Proposition 7.2. The First Residue Theorem. The ratio, r = f/g, of two polynomials where 
the order of / is less than that of g and g has h distinct roots {Ai, . . . , A^} of respective degrees 
{mi, . . . , m h }, may be expanded in partial fractions 

h rrij 
j=l k=l ^ A ^ 

where, as above, the residue r 3 ^ is computed by first clearing the fraction and then taking the 
proper number of derivatives and finally clearing their powers. That is, 

1 d m i~ k 

r jk = lim — r{(z - \ 3 ) m ir(z)\. (7.13) 

As an application, we consider 

B = I 1 3 1 (7.14) 




and compute the Rj t k matrices in the expansion of its resolvent 



^1 

R(z) = (zl - BY 1 = \ (J^_ 3 ) jzs 



,(2-l) 2 (z-3) (2-l)(2-3) 2-1/ 

+ 7 rvi^ 1 ' 2 ^ 0-^2,1- 



1 1,1 {z-iy 
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The only challenging term is the (3,1) element. We write 



n.i t-1,2 r 2 ,i 



(2-l) 2 (2-3) 2-1 (2-1) 2 2-3' 

It follows from (7.13) that 

= t^)' (1) = -i a,ld ^=(r^)w = -5 (715) 

and 

- ((T^Tp) (») = j- < 7 - 16 > 

It now follows that 

i / i o o\ 1 / N 



l 

+ 



2-3 



1/4 -1/2 1/ L ) \-l/2 0, 

/ 0\ 

1/2 1 . 

\l/4 1/2 0/ 



(7.17) 



In closing, we that the method of partial fraction expansions has been implemented in Matlab . 
In fact, (7.15) and (7.16) follow from the single command 

[r,p] =residue ( [0 1] , [1 -5 7 -3]). 

The first input argument is MATLAB -speak for the polynomial f(z) — 1 while the second argument 
corresponds to the denominator 

g( Z ) = (2 - 1) 2 (2 - 3) = 2 3 - 52 2 + 7^-3. 

7.4. Exercises 

1. Express |e x+lJ '| in terms of x and/or y. 

2. Suppose 2 7^ 1 and define the n-term geometric series 

71-1 

k=0 

and show, by brute force, that a — za = 1 — z n . Derive (7.8) from this result. 

3. Confirm that e lnz = z and hie 2 = z. 

4. Find the real and imaginary parts of cos 2 and sin 2. Express your answers in terms of regular 
and hyperbolic trigonometric functions. 

5. Show that cos 2 2 + sin 2 2 = 1. 
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6. As in the real calculus, the exponential and natural log permit one to define arbitrary powers. 
Please compute \fi via the definition z p = e plnz . 

7. Carefully sketch, by hand, the complex numbers u n = exp(27ri/n) for n = 2, 4 and 8. uo n is 
called an nth root of unity. Please compute by hand. Construct the n-by-n Fourier Matrix, 
F n , via 

F n (j,k) = ^ r ^- 1){k - 1) , (7.18) 



where the row index, j, and the column index, k, each run from 1 to n. We will now prove 
that the conjugate of F n is in fact the inverse of F n , i.e., that F n F* = I. To do this first show 
that the (j, m) element of the product F n F* is 

n 1 n 

J2 F (j,k)F*{k,m) = -^exp(27ri(£;- l)(j-m)/n). (7.19) 

k=l U k=l 

Conclude that this sum is 1 when j = m. To show that the sum is zero when j ^ m set 
z = exp(27ri(j — m)/n) and recognize in (7.19) the n-term geometric series of (7.8). 

8. Verify that sin z and cos z satisfy the Cauchy-Riemann equations (7.9) and use Proposition 7.1 
to evaluate their derivatives. 

9. Compute, by hand the partial fraction expansion of the rational function that we arrived at in 
(6.13). That is, find r^i, ri j2 , r 2 , r 3 and r 4 in 

s 2 + 1.3s + 0.31 

r(s) = 



(s + l/4) 2 (s + l/2)(s + l/5)(s + 11/10) 
n,i r\2 r 2 r 3 r 4 



s + 1/4 (s + 1/4) 2 s + 1/2 s + 1/5 s + 11/10 ' 

Contrast your answer with the explicit expression in (6.15). 

10. Submit a MATLAB diary documenting your use of residue in the partial fraction expansion 
of resolvent of 

You should achieve 







V2 






2 






V2 
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8. Complex Integration 

Our main goal is a better understanding of the partial fraction expansion of a given resolvent. 
With respect to the example that closed the last chapter, see (7.14)-(7.17), we found 

(zl - B)- 1 = -^-Pi + - f \ v^ -Pi + ~^-P2 
z — Ai (z — z — X 2 

where the Pj and Dj enjoy the amazing properties 

BP l = P X B = \ l P 1 +D 1 and BP 2 = P 2 B = \ 2 P 2 

P 1 + P 2 = I, P? = Pu Pi = P 2 , and Dl = 0, 
P l D l = D X P X = D l and P 2 D X = D X P 2 = 0. 

In order to show that this always happens, i.e., that it is not a quirk produced by the particular 
B in (7.14), we require a few additional tools from the theory of complex variables. In particular, 
we need the fact that partial fraction expansions may be carried out through complex integration. 

8.1. Cauchy's Theorem 

We shall be integrating complex functions over complex curves. Such a curve is parametrized 
by one complex valued or, equivalently, two real valued, function(s) of a real parameter (typically 
denoted by t). More precisely, 

C = {z{t) = x(t) + iy(t) : h < t < t 2 }. 
For example, if x(t) = y(t) = t from t\ — to t 2 — 1, then C is the line segment joining + iO to 

We now define t 

f( z )dz= r f(z(t))z f (t)dt. 

For example, if C = {t + it : < t < 1} as above and f(z) = z then 



c 



[ zdz= [ (t + it)(l + i) dt = [ {(t - t) + i2t} dt 
Jc Jo Jo 



while if C is the unit circle {e lt : < t < 2n} then 

p p2tt /'27r /'2-7T* 

/ z dz = / e u ie u dt = i e i2t dt = i {cos(2t) + i sin(2t)} dt = 0. 
Jc Jo Jo Jo 

Remaining with the unit circle but now integrating f(z) = 1/z we find 

z - 1 dz= / e- u ie u dt = 2iri. 



c 

We generalize this calculation to arbitrary (integer) powers over arbitrary circles. More precisely, 
for integer m and fixed complex a we integrate (z — a) m over 

C(a, p) = {a + pe u : < t < 2tt}, 
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the circle of radius p centered at a. We find 

1-2-K 

z-a) m dz= / (a + pe u -a) m pie u dt 

C[a,p) JO 

/2tt 
e l{m+1)t dt 

r2TT (8.1) 

ip m+1 / {cos((m + l)t) + i sin((m + l)t)} dt 
Jo 

2iri if m — — 1, 
otherwise, 

regardless of the size of p\ 

When integrating more general functions it is often convenient to express the integral in terms 
of its real and imaginary parts. More precisely 

/ f(z)dz= / {u(x,y) +iv(x,y)}{dx + idy} 
Jc Jc 

= / {u(x,y)dx - v(x,y)dy} + i / {u(x, y) dy + v(x, y) dx} 
Jc Jc 

= [ {u(x(t),y(t))x'(t)-v(x(t),y(t))y'(t)}dt 

J a 

+ i [ {u(x(t),y(t))y'(t) + v(x(t),y(t))x'(t)} dt. 



The second line should invoke memories of 

Proposition 8.1. Green's Theorem. If C is a closed curve and M and iV are continuously 
differentiable real- valued functions on Cj„, the region enclosed by C, then 

L {Mix+Niv) =ILM- d -^) ixiv 

Applying this proposition to the situation above, we find, so long as C is closed, that 

L md * = -JL(£ + %) dxdv + ' SL ® " I) 

At first glance it appears that Green's Theorem only serves to muddy the waters. Recalling the 
Cauchy-Riemann equations however we find that each of these double integrals is in fact identically 
zero! In brief, we have proven 

Proposition 8.2. Cauchy's Theorem. If / is differentiable on and in the closed curve C then 

f{z)dz = 0. 

c 

Strictly speaking, in order to invoke Green's Theorem we require not only that / be differentiable 
but that its derivative in fact be continuous. This however is simply a limitation of our simple mode 
of proof, Cauchy's Theorem is true as stated. 
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This theorem, together with (8.1), permits us to integrate every proper rational function. More 
precisely, if r = f/g where / is a polynomial of degree at most m — 1 and g is an mth degree 
polynomial with h distinct zeros at {Xj}^ =1 with respective multiplicities of {rrij}^ =l we found that 

h rnj 

j=i k=i K 3J 

Observe now that if we choose the radius pj so small that Xj is the only zero of g encircled by 
Cj = C(\j,pj) then by Cauchy's Theorem 

f mj r i 

In (8.1) we found that each, save the first, of the integrals under the sum is in fact zero. Hence 

/ r(z) dz = 2irir j:1 . (8.3) 

With Tj i in hand, say from (7.13) or residue, one may view (8.3) as a means for computing the 
indicated integral. The opposite reading, i.e., that the integral is a convenient means of expressing 
rj i, will prove just as useful. With that in mind, we note that the remaining residues may be 
computed as integrals of the product of r and the appropriate factor. More precisely, 

/ r(z)(z - A,-)*" 1 dz = 2mr jik . (8.4) 

One may be led to believe that the precision of this result is due to the very special choice of curve 
and function. We shall see ... 

8.2. The Second Residue Theorem 

After (8.3) and (8.4) perhaps the most useful consequence of Cauchy's Theorem is the freedom 
it grants one to choose the most advantageous curve over which to integrate. More precisely, 

Proposition 8.3. Suppose that C2 is a closed curve that lies inside the region encircled by the 
closed curve C\. If / is differentiable in the annular region outside C2 and inside C\ then 

/ f(z)dz= [ f(z)dz. 

Proof: With reference to the figure below we introduce two vertical segments and define the 
closed curves C3 = abcda (where the be arc is clockwise and the da arc is counter-clockwise) and 
C4 = adeba (where the ad arc is counter-clockwise and the cb arc is clockwise). By merely following 
the arrows we learn that 

I f(z)dz= I f(z)dz+ I f(z)dz+ I f(z)dz. 

As Cauchy's Theorem implies that the integrals over C3 and C4 each vanish, we have our result. 
End of Proof. 
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Figure 8.1. The Curve Replacement Lemma. 

This proposition states that in order to integrate a function it suffices to integrate it over regions 
where it is singular, i.e., nondifferentiable. 
Let us apply this reasoning to the integral 

dz 



'c ( z ~ ^i)( z ~ A 2 ) 

where C encircles both Ai and A 2 as depicted in the cartoon on the next page. We find that 

z dz f zdz 

dz = / -, —-z + 



J c (z - Ai)(2 - A 2 ) " J Cl (z - Xi)(z - A 2 ) Jc 2 (z - Xi)(z - A 2 ) 
Developing the integrand in partial fractions we find 

zdz Ai f dz A 2 f dz 



Ci K z ~ ^i)( z — ^2) Ai — A 2 J Cl z — Ai A 2 — Ai J Cl z — A 2 

27riAx 
Ai — A 2 

Similarly, 

I* z dz 27riA 2 

Jc 2 (z - Ai)(z - A 2 ) A 2 -Ai' 

Putting things back together we find 

dz = 2m [ X \ + - X \ ) = 2ni. (8.5) 



q [z — Ai)(z — A2) \Ai — A2 A2 — A 



Replace this contour 




Now, as we squeeze the neck (and round the 
heads) we find that the two sides of the neck 
cancel and we arrive at the two contours 

Figure 8.2. Concentrating on the poles. 
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We may view (8.5) as a special instance of integrating a rational function around a curve that 
encircles all of the zeros of its denominator. In particular, recalling (8.2) and (8.3), we find 



. h rnj h 

r(z)dz = YY , Tj '1 , h dz = 2niY 



h rnj h 

(8.6) 



To take a slightly more complicated example let us integrate f(z) /(z — a) over some closed curve 
C inside of which / is differentiable and a resides. Our Curve Replacement Lemma now permits us 
to claim that 

c z ~ a Jc(a, P ) z — a 

It appears that one can go no further without specifying /. The alert reader however recognizes 
that the integral over C(a, p) is independent of r and so proceeds to let r — > 0, in which case z — > a 
and f(z) — > f(a). Computing the integral of l/(z — a) along the way we are lead to the hope that 



f l^-dz = f(a)2m 
Jc z ~ a 



In support of this conclusion we note that 



la 



c(a, P ) z - a Jc(a, P ) {z — a z - a z - a 



{a) f ^ i:+ r u^m iz . 

JC(a,p) z - a JC(a,p) z - a 



IC(a,p) Z-U Jc(a,p) 

Now the first term is f(a)2m regardless of p while, as p — > 0, the integrand of the second term 
approaches f'(a) and the region of integration approaches the point a. Regarding this second term, 
as the integrand remains bounded as the perimeter of C(a,p) approaches zero the value of the 
integral must itself be zero. End of Proof. 

This result is typically known as 

Proposition 8.4. Cauchy's Integral Formula. If / is differentiable on and in the closed curve 
C then 

1 f f(z) 



f(a) = — / ^dz (8.7) 
2-ni J c z — a 

for each a lying inside C. 

The consequences of such a formula run far and deep. We shall delve into only one or two. First, 
we note that, as a does not lie on C, the right hand side is a perfectly smooth function of a. Hence, 
differentiating each side, we find 



da 2m J c da z — a 2m J c [z — ay 



L f /(*) 

7" JC ( z - a ) 



for each a lying inside C. Applying this reasoning n times we arrive at a formula for the nth 
derivative of / at a, 

£/(„) = £[. M- iz (8.9) 
da nK ' 2m J c (z - aY+ n v ; 
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for each a lying inside C. The upshot is that once / is shown to be differentiable it must in fact be 
infinitely differentiable. As a simple extension let us consider 

1 t * 



2tt» Jo (z - Ai)(z - A 2 ) 2 

where / is still assumed differentiable on and in C and that C encircles both Ai and A2. By the 
curve replacement lemma this integral is the sum 

1 f m dz+ ±f , m ... 



2rri J Cl (z - Ai)(z - A 2 ) 2 2tu J C2 (2 - Ai)(z - A 2 ) 

where Aj now lies in only Cj. As f(z)/ (z — A 2 ) is well behaved in C\ we may use (8.7) to conclude 
that 

/(*) dz _ /(AO 



2mJ Cl ( Z -X 1 )(z-X 2 y (X 1 -X 2 y 
Similarly, As f(z)/(z — Ai) is well behaved in C 2 we may use (8.8) to conclude that 



-/ 

Jc, 



a=\2 



2ni J C2 (z - Xi)(z - A 2 ) 2 da(a-Xi) 
These calculations can be read as a concrete instance of 

Proposition 8.5. The Second Residue Theorem. If g is a polynomial with roots {Xj}j =1 of 
degree {rrij}^ =1 and C is a closed curve encircling each of the Xj and / is differentiable on and in 
C then 

h 



[ M d z = 2niJ2res(f/g,X j ) 

Jc 9K Z ) jr[ 



where 

1 d m i~ x { Hz 
res(f/g, A,) = lim j- -— (z - A - ' 



2— >Aj (rrij - 1)! d^ -1 \ fi 1 !- 2 ) 
is called the residue of //^ at Xj by extension of (7.13). 
One of the most important 'instances' of this theorem is the formula for 

8.3. The Inverse Laplace Transform 

If r is a rational function with poles {Xj}^ =1 then the inverse Laplace transform of r is 

{C- l r){t) = f r{z)e zt dz (8.10) 
2m J c 

where C is a curve that encloses each of the poles of r. As a result 

h 

(£ -1 r)(t) = ^res(r(z)e zt ,A j ). (8.11) 
j'=i 

Let us put this lovely formula to the test. We take our examples from chapter 6. Let us first 
compute the inverse Laplace Transform of 

r(z) = 



{z + iy 
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According to (8.11) it is simply the residue of r(z)e zt at z — — 1, i.e., 

d 



res(— 1) = lim — e 

z-*-i dz 



zt 



te 



This closes the circle on the example begun in §6.3 and continued in exercise 6.1. For our next 
example we recall from (6.13) (ignoring the leading 1.965), 



Cxi(s) 



(s 2 + 1.3s + 0.31) 



(s + l/4) 2 (s 3 + 1.8s 2 + 0.87s + 0.11)) 

(s 2 + 1.3s + 0.31) 
(s + l/4) 2 (s + l/2)(s + l/5)(s + 11/10) ' 



and so (8.11) dictates that 



Xi(t) = exp(-t/4) 



d 



fs 2 + 1.3s + 0.31) 



+ exp(-t/2) 
+ exp(-t/5) 



ds (s + 1/2) (s + l/5)(s + 11/10) 
(s 2 + 1.3s + 0.31) 



[s + l/4) 2 (s + l/5)(s + 11/10) 
(s 2 + 1.3s + 0.31) 



+ exp(-m/10) 



s + l/4) 2 (s + l/2)(s + 11/10) 
(s 2 + 1.3s + 0.31) 



s=-l/4 



s=-l/2 



s=-l/5 



(8.12) 



;s + l/4) 2 (s + l/2)(s + l/5) 



s=-10/ll 



Evaluation of these terms indeed confirms our declaration, (6.15), and the work in exercise 7.9. 

The curve replacement lemma of course gives us considerable freedom in our choice of the curve 
C used to define the inverse Laplace transform, (8.10). As in applications the poles of r are typically 
in the left half of the complex plane (why?) it is common to choose C to be the half circle 

c = c L ( P )uc A (p), 

comprised of the line segment, Cl, and arc, Ca, 

C L (p) = {iuj : -p < uj < p} and C A (p) = {pe ie : tt/2 < 9 < 3tt/2}, 

where p is chosen large enough to encircle the poles of r. With this concrete choice, (8.10) takes 
the form 



(£~V)(t) = / r(z)e zt dz + ^- [ r(z)e zt dz 
^ m Jc L 2m J Ca 



1 

27 



C A 
3tt/2 



r(iuj)e wt du + 



2tt 



r{pe w Y cl \ w d6. 



(8.13) 



vr/2 



Although this second term appears unwieldy it can be shown to vanish as p — > oo, in which case 
we arrive at 

1 



(£-V)(i) 



2vr 



r(iuj)e tuJt dv, 



(8.14) 



the conventional definition of the inverse Laplace transform. 
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8.4. Exercises 



1. Compute the integral of z 2 along the parabolic segment z(t) = t + it 2 as t ranges from to 1. 

2. Evaluate each of the integrals below and state which result you are using, e.g., The bare-handed 
calculation (8.1), Cauchy's Theorem, The Cauchy Integral Formula, The Second Residue The- 
orem, and show all of your work. 

f cos(z) f cos(z) f cos(z) 

JC(2,1) z — 2 Jc(2,i) z ( z — 2) Jc(2,i) z ( z + 2) 
f cos(z) f cos(z) f zcos(z) 
/ ~T^ dz > / — 3~ dz ^ / 7 

JC(0,2) Z + Z JC(0,2) z JC(0,2) Z - I 



dz. 



3. Let us confirm the representation (8.4) in the matrix case. More precisely, if R(z) = (zl — B) 1 
is the resolvent associated with B then (8.4) states that 

j=i k=i v ■>' 

where 

R ^ k= 2^J c R{Z){Z ~ Aj)fc_1 dZ - (8 - 15) 

Compute the Rj^ per (8.15) for the B in (7.14). Confirm that they agree with those appearing 
in (7.17). 

4. Use (8.11) to compute the inverse Laplace transform of l/(s 2 + 2s + 2). 

5. Use the result of the previous exercise to solve, via the Laplace transform, the differential 
equation 

x'(t) + x{t) = e~* sin t, x(0) = 0. 
Hint: Take the Laplace transform of each side. 

6. Evaluate all expressions in (8.12) in MATLAB 's symbolic toolbox via syms, dif f and subs 
and confirm that the final result jibes with (6.15). 

7. Return to §6.6 and argue how one deduces (6.26) from (6.25). Evaluate (6.26) and graph it 
with ezplot and contrast it with Fig. 6.3. 

8. Let us check the limit we declared in going from (8.13) to (8.14). First show that 

Next show (perhaps graphically) that 

cos# < 1 - 29 /n when vr/2 < 9 < n. 
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Now confirm each step in 

-3vr/2 



P 



■k/2 



3tt/2 
tt/2 



< p max \r(pe l 

pit 

= pmax |r(pe ie )|2 / 
9 ' Jn/: 



e pei9t \d9 



e ptcosd d6 
<pmax|r(pe* e )|2 T e pt{1 ~ 2d/n) d6 

6 Jn/2 

= max\r{pe id )\{n/t){l - e~ pt ), 



and finally argue why 



as p — > oo. 



max |r(pe^)| — > 
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9. The Eigenvalue Problem 



Harking back to chapter 6 we labeled the complex number A an eigenvalue of B if XI — B was 
not invertible. In order to find such A one has only to find those s for which (si — B)^ 1 is not 
defined. To take a concrete example we note that if 




B = ( 1 | (9.1) 

then 



>-l)(s-2) s-2 

(sI-B)- 1 = 7 ^ ^( (s-l)(s-2) | (9.2) 

(s 



(8-1)1(8-2) 




and so Ai = 1 and A2 = 2 are the two eigenvalues of B. Now, to say that Xjl — B is not invertible 
is to say that its columns are linearly dependent, or, equivalently, that the null space M(XjI — B) 
contains more than just the zero vector. We call M(XjI — B) the jth eigenspace and call each of its 
nonzero members a jth eigenvector. The dimension oiAf(Xjl-B) is referred to as the geometric 
multiplicity of Xj. With respect to B above we compute J\f(X\I — B) by solving (I — B)x = 0, 




Clearly 

Af(XJ -B) = {c[l 0] T :cG R}. 
Arguing along the same lines we also find 

M(X 2 I-B) = {c[0 1] T : c G R}. 

That B is 3-by-3 but possesses only 2 linearly independent eigenvectors, much like our patient 
on the front cover, suggests that matrices can not necessarily be judged by the number of their 
eigenvectors. After a little probing one might surmise that -B's condition is related to the fact that 
Ai is a double pole of (si — B)^ 1 . In order to flesh out that remark and find a proper replacement 
for the missing eigenvector we must take a much closer look at the resolvent. 

9.1. The Resolvent 

One means by which to come to grips with (si — B)' 1 is to treat it as the matrix analog of the 
scalar function 

— (9.3) 
s — 

This function is a scaled version of the even simpler function 1/(1 — z). This latter function satisfies 
(recall the n-term geometric series (7.8)) 

' = 1 + z + z 2 + • • • + z n - x + (9.4) 



1-z 1-z 
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for each positive integer n. Furthermore, if \z\ < 1 then z n — > as n — > oo and so (9.4) becomes, 
in the limit, 



n=0 



the full geometric series. Returning to (9.3) we write 



1 lis 1 b b n ~ l b n 1 
+ + ■■■ + + 



s — b 1 — b/s s s 2 s n s n s — b' 

and hence, so long as \s\ > \b\ we find, 



oo 

— = -Y 



s — 6 s \ s 

n=0 v 

This same line of reasoning may be applied in the matrix case. That is, 

i r> Dn-1 p>n 

(si - Br = s-\l - B/sr = - + y 2 +••• + — + ^ ~ BY 1 , (9.5) 
and hence, so long as \s\ > \\B\\ where \\B\\ is the magnitude of the largest eigenvalue of B, we find 

oo 

(sJ-B)"^ a" 1 (9.6) 

n=0 

Although (9.6) is indeed a formula for the resolvent you may, regarding computation, not find it 
any more attractive than the Gauss- Jordan method. We view (9.6) however as an analytical rather 
than computational tool. More precisely, it facilitates the computation of integrals of R(s). For 
example, if C p is the circle of radius p centered at the origin and p > \\B\\ then 

n OO „ 

/ (si - B)- 1 ds = J2 Bn s- l - n ds = 2ttU. (9.7) 

Jc p n=0 JCp 

This result is essential to our study of the eigenvalue problem. As are the two resolvent identities. 
Regarding the first we deduce from the simple observation 

(s 2 I - B)- 1 - ( Sl I - B)- 1 = (s 2 I - B)-\s 1 I -B- s 2 I + B)( Sl I - B)' 1 

that 

R(s 2 ) - R(si) = (si - s 2 )R(s 2 )R( Sl ). (9.8) 
The second identity is simply a rewriting of 

(si - B)(sl - B)- 1 = (si - B)-\sI -B) = I, 

namely, 

BR(s) = R(s)B = sR(s) - I. (9.9) 
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9.2. The Partial Fraction Expansion of the Resolvent 

The Gauss-Jordan method informs us that R(s) will be a matrix of rational functions of s, 
with a common denominator. In keeping with the notation of the previous chapters we assume the 
denominator to have the h distinct roots, {\j}^ =1 with associated orders {mj}J =1 . 

Now, assembling the partial fraction expansions of each element of R we arrive at 

h rrij 
j=l k=l V J7 

where, recalling equation (8.4), Rj k is the matrix 

Rj,k = tt-. f R{z)(* - \-) fc_1 (9.H) 
To take a concrete example, with respect to (9.1) and (9.2) we find 

/l o\ /0 1 o\ /o o\ 

= 1 R lt2 = and R 2 ,i = . 
\0 0/ \0 0/ \0 1/ 

One notes immediately that these matrices enjoy some amazing properties. For example 

R 2 ±1 = #1,1, R\ x = R 2>1 , Ri,iR 2 ,i = 0, and R\ 2 = 0. (9.12) 

We now show that this is no accident. We shall find it true in general as a consequence of (9.11) 
and the first resolvent identity. 

Proposition 9.1. R^ x = R^-y. 

Proof: Recall that the C 3 - appearing in (9.11) is any circle about Xj that neither touches nor 
encircles any other root. Suppose that C 3 - and Cj are two such circles and Cj encloses Cj. Now 



and so 
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We used the first resolvent identity, (9.8), in moving from the second to the third line. In moving 
from the fourth to the fifth we used only 

/ —^—dw = 2m and / — ?— dz = 0. (9.13) 
Jc w ~ z Jc 1 w ~ z 

3 J 

The latter integrates to zero because Cj does not encircle w. End of Proof. 

Recalling definition 5.1 that matrices that equal their squares are projections we adopt the 
abbreviation 

Pj = R jA . 

With respect to the product PjPk, for j ^ k, the calculation runs along the same lines. The 
difference comes in (9.13) where, as Cj lies completely outside of C k , both integrals are zero. Hence, 

Proposition 9.2. If j ^ k then PjP k = 0. 

Along the same lines we define 

Dj = Rj,2 

and prove 

Proposition 9.3. If 1 < k < rrij - 1 then D) = Rj, k +i- D™ j = 0. 
Proof: For k and I greater than or equal to one, 

R jM1 R jt e +1 = 1 / R(z)(z - Xjfdz [ R(w)(w - Xj) e dw 
(2m) J c J c , 



[ R(z)R(w)(z-X J ) k (w~X J Ydwdz 
yzm) j c j c i 

J J 

j j m-RM k y dwdz 



(2m) 2 



(2m 



— l — [ R(w)(w-X 1 ) e [ ^ Xj) dzdw 
(2mf J c , 1 3) J Cj w-z 

J J 

^- I R(z)(z - Xj) k+i dz = R jtk+e+1 . 



because 



(W X ^ dw = 2m(z-X,Y and [ ^ Aj) * dz = 0. (9.14) 

Jc, w-z 



w — z 



With k = £ = 1 we have shown Rj 2 = Rj,3, i.e., D 2 = i? j 3 . Similarly, with k — 1 and £ = 2 we find 
Rj,2Rj,3 = Rj,4, i-e., Dj = Rj t 4, and so on. Finally, at k — rrij we find 



Dj> =R hrnj+1 = ^- / n{z)(z-\j)""<h =0 



by Cauchy's Theorem. End of Proof. 

66 



Of course this last result would be trivial if in fact Dj = 0. Note that if rrij > 1 then 



Dp' 1 = R j>mj = [ R(z)(z - Xj)™'- 1 dz ± 



for the integrand then has a term proportional to \j{z — Xj), which we know, by (8.1), leaves a 
nonzero residue. 

With this we now have the sought after expansion 

KM^jrh^+Eidb^}' (9 ' 15) 

along with verification of a number of the properties enjoyed by the Pj and Dj. 

In the case that each eigenvalue is simple pole of the resolvent, i.e., rrij = 1, we arrive at the 
compact representation 

n p 

(zI-Br^J^-^Y- (9-16) 

As a quick application of this we return to a curious limit that arose in our discussion of Backward 
Euler. Namely, we would like to evaluate 

lim (/ -{t/k)B)- k . 

k^oo 

To find this we note that (zl — B)^ 1 = (I — B / z)^ 1 / z and so (9.16) may be written 

' zP, A Pi 



If we now set z = k/t, where k is a positive integer, and use the fact that the Pj are projections 
that annihilate one another then we arrive first at 

and so, in the limit find 

h 

lim (/ - {t/k)B)~ k = V exp(A j t)P i (9.19) 

and naturally refer to this limit as exp(-Bt). We will confirm that this solve our dynamics problems 
by checking, in the Exercises, that (exp(Bt))' = Bexp(Bt). 

9.3. The Spectral Representation 

With just a little bit more work we shall arrive at a similar expansion for B itself. We begin by 
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applying the second resolvent identity, (9.9), to Pj. More precisely, we note that (9.9) implies that 

BPj = PjB = h \ {zR{z) ~ J) dz 



1 



zR(z) dz 

Cj (9.20) 



— / R(z)(z-X i )dz + ^ L [ R(z)dz 
*™ Jet 2m Jc, 



= Dj + XjPj, 

where the second equality is due to Cauchy's Theorem and the third arises from adding and sub- 
tracting XjR(z). Summing (9.20) over j we find 

h h h 

"E^-Ev^-E^- ( 9 - 21 ) 

j=i j=i j=i 

We can go one step further, namely the evaluation of the first sum. This stems from (9.7) where we 
integrated R(s) over a circle C p where p > \\B\\. The connection to the Pj is made by the residue 
theorem. More precisely, 

h 



/ R(z)dz = 27riJ2Pj. 

J C P 7=1 



Comparing this to (9.7) we find 



and so (9.21) takes the form 



h 

E P , =J ' (9.22) 



h h 

11 E A ^ , ? • E /; - ( 9 - 23 ) 

3=1 3=1 

It is this formula that we refer to as the Spectral Representation of B. To the numerous 
connections between the Pj and Dj we wish to add one more. We first write (9.20) as 

(B - X 3 I)P 3 = Dj (9.24) 

and then raise each side to the rrij power. As P™ 3 = Pj and D™ 3 = we find 

(B - XjI) m iPj = 0. (9.25) 

For this reason we call the range of Pj the jth generalized eigenspace, call each of its nonzero 
members a jth generalized eigenvector and refer to the dimension of TZ(Pj) as the algebraic 
multiplicity of Xj. Let us conform that the eigenspace is indeed a subspace of the generalized 
eigenspace. 

Proposition 9.4. N(XjI — B) C TZ(Pj) with equality if and only if rrij — 1. 

Proof: For each e G N{XjI — B) we show that Pjt = e. To see this note that Be = Xjt and so 
(si — B)e = (s — Xj)e so (si — B)~ l e — (s — Xj)^e and so 

PjC = / (si — B)~ 1 eds = / —eds = e. 

3 2mJ c . y 1 2mJ c .s-Xj 
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By (9.24) we see that H(Pj) C M{\jl - B) if and only if m 3 - = 1. End of Proof. 

With regard to the example with which we began the chapter we note that although it has only 
two linearly independent eigenvectors the span of the associated generalized eigenspaces indeed 
fills out R 3 , i.e., TZ(Pi) © ^■(■^2) = R 3 - One may view this as a consequence of Pi + P 2 = I, 
or, perhaps more concretely, as appending the generalized first eigenvector [0 1 0] T to the original 
two eigenvectors [1 0] T and [0 1] T . In still other words, the algebraic multiplicities sum to the 
ambient dimension (here 3), while the sum of geometric multiplicities falls short (here 2). 

9.4. Diagonalization of a Semisimple Matrix 

If rrij = 1 then we call Xj semisimple. If each rrij = 1 we call B semisimple. In this case we see 
that 

fcmN{\jI - B) = dimK(Pj) = rij 

and that 

h 

J2 n J = n - 

We then denote by Ej = [e^i ej t2 ■ ■ • e.j, nj ] a matrix composed of basis vectors of IZ(Pj). We note 
that 

Bej^k = ^j e j,k, 

and so 

BE = EA where E=\E X E 2 ■■■ E h \ (9.26) 
and A is the diagonal matrix of eigenvalues, 

A = diag(Aiones(rai, 1) A 2 ones(n 2 , 1) ■ • • \ h ones{nh, 1)). 

As the eigenmatrix, E, is square and composed of linearly independent columns it is invertible and 
so (9.26) may be written 

B = EAE~ 1 and A = E^ l BE. (9.27) 

In this sense we say that E diagonalizes B. Let us work out a few examples. With respect to the 
rotation matrix 

*-(-! i 

we recall, see (8.3), that 



1 (1/2 -i/2\ 1 / 1/2 i/2 



s-i \i/2 1/2 J ^ s + t \-i/2 1/2 

1 « + -^«, 



+ — (9.28) 



and so 



s — Ai s — A 
B = A1P1 + A 2 P 2 



1/2 -i/2\ _ . f 1/2 i/2 
i/2 1/2 J 1 I -i/2 1/2 
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As mi = m 2 = 1 we see that B is semisimple. By inspection we see that TZ{Pi) and 7£(P 2 ) are 
spanned by 



e\— ( \ ] and e 2 = 



respectively. It follows that 



y i -i r 2 U W \0 -i 



1(1 lWi 0W1 -j 
2 ! -i J V -i J V 1 i 



5=0 1 1 (9.29) 











Ms 


1 












and so 

B = EAE~ 

Consider next, 



and note that [E,L] = eig(B) returns 

(1 l/v/3 
E= 1 l/-s/3 
\0 l/>/3 

and so B = ELE~ l . We note that although we did not explicitly compute the resolvent we do 
not have explicit information about the orders of its poles. However, as Matlab has returned 2 
eigenvalues, A x = 1 and A 2 = 2, with geometric multiplicities rt\ — 2 (the first two columns of E 
are linearly independent) and n 2 = 1 it follows that m\ = m 2 = 1. 

9.5. The Characteristic Polynomial 

Our understanding of eigenvalues as poles of the resolvent has permitted us to exploit the beau- 
tiful field of complex integration and arrive at a (hopefully) clear understanding of the Spectral 
Representation. It is time however to take note of alternate paths to this fundamental result. In 
fact the most common understanding of eigenvalues is not via poles of the resolvent but rather as 
zeros of the characteristic polynomial. Of course an eigenvalue is an eigenvalue and mathematics 
will not permit contradictory definitions. The starting point is as above - an eigenvalue of the n- 
bj-n matrix B is a value of s for which (si — B) is not invertible. Of the many tests for invertibility 
let us here recall that a matrix is not invertible if and only if it has a zero pivot. If we denote the 
jth pivot of B by Pj(B) we may define the determinant of B as 

n 

det(fl) = l[ Pj (B) 

For example, the pivots of 
are a and d — be/ a and so 

det(.B) = ad — be. 
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Similarly, the pivots of 



sI-B = 



s — 



—c 



a 



-b 
s — d 



are s — a and (s — d) — be/ (s — a) and so 



Xb(s) = (s- a)(s 



d) - be. 



This is a degree-2 polynomial for the 2-by-2 matrix B. The formula that defines the determinant 
of a general n-by-n matrix, though complicated, always involves an alternating sum of products of 
n elements of B, and hence 



is a polynomial of degree n. It is called the characteristic polynomial of B. Since the zeros 
of Xb{s) are the points where det(s/ — B) = 0, those zeros are precisely the points where si — B 
is not invertible, i.e., the eigenvalues of B. The Fundamental Theorem of Algebra tells us that a 
degree-n polynomial has no more than n roots, and thus an n-by-n matrix can never have more 
than n distinct eigenvalues. In the language of §9.3, we must have h < n. Let us look at a few 
examples. First, consider the matrix from §9.5, 



confirming that the eigenvalues of B are ±i. 

For larger matrices we turn to Matlab . Its poly function directly delivers the coefficients of 
the characteristic polynomial. Returning to the 3-by-3 example of (9.1), we find 

>> B = [1 1 0; 1 0; 2] 

>> chi_B = poly (B) 

chi_B = 1-45-2 

>> roots (p) 

2 .0000 

1.0000 + O.OOOOi 
1.0000 - O.OOOOi 

which means Xb{s) = s 3 — 4s 2 + 5s — 2 = (s — l) 2 (s — 2) which is none other than the common 
denominator in our resolvent (9.2). It is interesting to contrast this with the example of (9.29). 
They have the same characteristic polynomials and therefore the same eigenvalues, although the 
latter is semisimple while the former is not. 

9.6. Exercises 



X b(s) = det(sl - B) 




The simple formula for the 2-by-2 determinant leads to 



det(s/ - B) = s 2 + 1 = (s - i)(s + i), 



1. Argue as in Proposition 9.1 that if j ^ k then DjP k = PjD k = 0. 

2. Argue from (9.24) and P 2 = Pj that DjPj = PjDj = Dj. 
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3. We suggested in Chapter 6 that x(t) = exp(Bt)x(0) is the solution of the dynamical system 
x'(t) = Bx{t). Let us confirm that our representation (for semisimple B) 

h 

exp(Bt) = ^exp(A j t)P i (9.30) 
j=i 

indeed satisfies 

(exp(-Bf))' = Bexp(Bt). 
To do this merely compare the t derivative of (9.30) with the product of 

h 

B = J2 \Pj 

and (9.30). 

4. Let us compute the matrix exponential, exp(Bt), for the nonsemisimple matrix B in (9.1) by 
following the argument that took us from (9.16) to (9.19). To begin, deduce from 

(zl — B) -1 = — -r-Pi + -, \ „ D 1 + ' ~ 



z — Ai (z — Ai) 2 z - A 2 
that 

(/ - B/z)- 1 = 1 P 1 + 1 D 1 + - 1 P 2 . 

1 — \ 1 /z z(l — Xi/zy 1 — X2/ z 

Next take explicit products of this matrix with itself and deduce (using exercise 2) that 

(/ " B/Z) ' k = (1- + zil-l/z)^ 1 + (1- W P2 ' 
Next set z = k/t and find that 

(/ " = (i-(t/*)Ai) fcPl + H/iF' 1 + (i - (t/k)\ 2 )* p *> 

and deduce that 

exp(Bt) = lim (7 - (t/k)B)~ k = exp(\ 1 t)P 1 + texp(\ 1 t)D 1 + exp(A 2 t)P 2 - 

k— >oo 

Finally, argue as in the previous exercise that (exp(Bt))' = Bexp(Bt). 

5. We would now like to argue, by example, that the Fourier Matrix of Exer. 7.7 diagonalizes 
every circulant matrix. We call this matrix 

/2 8 6 4\ 
4 2 8 6 
6 4 2 8 
\8 6 4 2 J 

circulant because each column is a shifted version of its predecessor. First compare the results 
of eig (B) and F£B(:, 1) and then confirm that 

B = F 4 diag(F 4 *5(:, l))F 4 */4. 

Why must we divide by 4? Now check the analogous formula on a circulant 5-by-5 circulant 
matrix of your choice. Submit a marked-up diary of your computations. 
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6. Let us return exercise 6.7 and study the eigenvalues of B as functions of the damping d when 
each mass and stiffness is 1. In this case 



B 



-dS 



where S 



2 


-1 


-1 


2 -1 





-1 2 



(i) Write and execute a Matlab program that plots, as below, the 6 eigenvalues of B as d 
ranges from to 1.1 in increments of 0.005. 





Figure 9.1. Trajectories of eigenvalues of the damped chain as the damping increased. 

(ii) Argue that if [u; v] T is an eigenvalue of B with eigenvalue A then v — Xu and — Su — dSv = 
Xv. Substitute the former into the latter and deduce that 



Su 



-A 2 
1 + dX 



u. 



(iii) Confirm, from Exercise 7.10, that the eigenvalues of S are /ix = 2 + y/2, /x 2 = 2 and 
/i 3 = 2 — \pl and hence that the six eigenvalues of B are the roots of the 3 quadratics 



A 2 + djijX + fij = 0, i.e., A±j 



-dfij ± ^{dfij) 2 - Afij 



Deduce from the projections in Exer. 7.10 the 6 associated eigenvectors of B. 

(iv) Now argue that when d obeys (dfij) 2 = 4fj,j that a complex pair of eigenvalues of B collide 
on the real line and give rise to a nonsemisimple eigenvalue. Describe Figure 9.1 in light of 
your analysis. 
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10. The Spectral Representation of a Symmetric Matrix 

Our goal in this chapter is to show that if B is symmetric then 

• each eigenvalue, Xj, is real, 

• each eigenprojection, Pj, is symmetric and 

• each eigennilpotent, Dj, vanishes. 

Let us begin with an example. The resolvent of 



is 



R(s) 




1/3 1/3 1/3 N 
1/3 1/3 1/3 
1/3 1/3 1/3, 



s — Xi s - 

and so indeed each of the bullets holds true. With each of the Dj falling by the wayside you may 
also expect that the respective geometric and algebraic multiplicities coincide. 

10.1. The Spectral Representation 

We begin with 

Proposition 10.1. If B is real and B = B T then the eigenvalues of B are real. 

Proof: We suppose that A and x comprise an eigenpair of B, i.e., Bx = Xx and show that A = A. 
On conjugating each side of this eigen equation we find 

Bx = Xx and Bx = Xx (10.1) 

where we have used the fact that B is real. We now take the inner product of the first with x and 
the second with x and find 

x T Bx = X\\x\\ 2 and x T Bx = X\\x\\ 2 . (10.2) 
The left hand side of the first is a scalar and so equal to its transpose 

x T Bx = (x T Bx) T = x T B T x = x T Bx (10.3) 
where in the last step we used B = B T . Combining (10.2) and (10.3) we see 

A||x|| 2 = A||x|| 2 . 

Finally, as \\x\\ > we may conclude that A = A. End of Proof. 
We next establish 

Proposition 10.2. If B is real and B = B T then each eigenprojection, Pj, and each eigennilpotent, 
Dj, is real and symmetric. 
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Proof: From the fact (please prove it) that "the transpose of the inverse is the inverse of the 
transpose" we learn that 

{(si - B)~Y = {(sl- B) T }~i = (si - fi)- 1 
i.e., the resolvent of a symmetric matrix is symmetric. Hence 

= — [ (si -B)- 1 ds = P j . 

By the same token, we find that each Dj = Dj. The elements of Pj and Dj are real because they 
are residues of real functions evaluated at real poles. End of Proof. 

The next result will banish the nilpotent component. 

Proposition 10.3. The zero matrix is the only real symmetric nilpotent matrix. 

Proof: Suppose that D is n-by-n and D = D T and D m = for some positive integer m. We 
show that D m ~ l = by showing that every vector lies in its null space. To wit, if x G R™ then 

WD^xf = x T (D m - l ) T D m - 1 x 
= x T D m ~ 1 D m ~ 1 x 
= x T D rn ~ 2 D m x 
= 0. 

As D m ~ x x = for every x it follows (recall Exercise 3.3) that D m ~ x = 0. Continuing in this fashion 
we find D m ~ 2 = and so, eventually, D = 0. End of Proof. 

We have now established the key result of this chapter. 

Proposition 10.4. If B is real and symmetric then 

h 

B = Y, X i p j ( 10 - 4 ) 

where the \j are real and the Pj are real orthogonal projections that sum to the identity and whose 
pairwise products vanish. 

One indication that things are simpler when using the spectral representation is 

h 

B W0 = J2 X f° P r ( 10 - 5 ) 

3=1 

As this holds for all powers it even holds for power series. As a result, 

h 

exp(B) = ^exp(A i )P j . 

3=1 
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It is also extremely useful in attempting to solve 

Bx = b 

for x. Replacing B by its spectral representation and b by lb or, more to the point by J2j Pjb we 
find 

h h 

Ew E '' h - 

3=1 3=1 

Multiplying through by Pi gives \iP\X = P\b or P ± x = Pib/X 1 . Multiplying through by the 
subsequent PjS gives PjX = Pjb/\j. Hence, 

h h 



x ^' J ~ ^\ 

3=1 3=1 3 

We clearly run in to trouble when one of the eigenvalues vanishes. This, of course, is to be expected. 
For a zero eigenvalue indicates a nontrivial null space which signifies dependencies in the columns 
of B and hence the lack of a unique solution to Bx = b. 

Another way in which (10.6) may be viewed is to note that, when B is symmetric, (9.15) takes 
the form 



{zI-BY^j^-^P,. 



i z ~ \ 

3=1 J 

Now if is not an eigenvalue we may set z = in the above and arrive at 

h 



3=1 J 



A 

3=1 

Hence, the solution to Bx = b is 



x = B-'b = y Pjb, 



3=1 3 

as in (10.6). With (10.7) we have finally reached a point where we can begin to define an inverse 
even for matrices with dependent columns, i.e., with a zero eigenvalue. We simply exclude the 
offending term in (10.7). Supposing that A/, = we define the pseudo-inverse of B to be 



h-l 



3=1 



Let us now see whether it is deserving of its name. More precisely, when b G TZ(B) we would expect 
that x = B + b indeed satisfies Bx = b. Well 

h-l h-l h-l h-l 

BB+b = BJ2 yPjb = E Y BP]h = E y XjPjb = E P i b - 

3=1 3 3=1 3 3=1 3 3=1 

It remains to argue that the latter sum really is b. We know that 

h 

b = ^2Pjb and beK(B). 

3=1 
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The latter informs us that b _L M{B T ). As B = B T we have in fact that b _L M{B). As P^ is 
nothing but orthogonal projection onto A/"(P) it follows that P^fe = and so B(B + b) = b, that is, 
x = B + b is a solution to Pre = b. 

The representation (10.5) is unarguably terse and in fact is often written out in terms of individual 
eigenvectors. Let us see how this is done. Note that if x G TZ(Pi) then x = P±x and so 

h 

Bx = BP\x = XjPjPix = \\P\X = X±x, 

i.e., x is an eigenvector of P associated with Ai. Similarly, every (nonzero) vector in TZ(Pj) is an 
eigenvector of P associated with Xj. 

Next let us demonstrate that each element of IZ(Pj) is orthogonal to each element of TZ(P k ) when 
j ^k. If x G TZ(Pj) and y G TZ(P k ) then 

x T y = (Pjx) T P k y = x T PjP k y = 0. 

With this we note that if {2^1,2^2, . • • ,Xj, nj } constitutes a basis for TZ(Pj) then in fact the union 
of such bases, 

{x j)P :l<j<h, l<p< rij}, (10.8) 
forms a linearly independent set. Notice now that this set has 

h 

i=i 

elements. That these dimensions indeed sum to the ambient dimension, n, follows directly from the 
fact that the underlying Pj sum to the n-by-n identity matrix. We have just proven 

Proposition 10.5. If P is real and symmetric and n-by-n then P has a set of n linearly independent 
eigenvectors. 

Getting back to a more concrete version of (10.5) we now assemble matrices from the individual 

bases 

Ej = [Xj : i Xj : 2 ■ ■ ■ Xj, nj ] 

and note, once again, that Pj = Ej(EjE j )~ 1 Ej, and so 

h 

B = J2^E J (EjE j )- 1 Ef. (10.9) 

I understand that you may feel a little underwhelmed with this formula. If we work a bit harder 
we can remove the presence of the annoying inverse. What I mean is that it is possible to choose a 
basis for each TZ(Pj) for which each of the corresponding Ej satisfy EjEj = I. As this construction 
is fairly general let us devote a separate section to it. 

10.2. Gram— Schmidt Orthogonalization 

Suppose that M is an m-dimensional subspace with basis 

{x\ ) . . . , x m }. 
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We transform this into an orthonormal basis 

{91, •■ • ,1m} 

for M via 

(GS1) Set yi = xi and ^ = yi/||yi||. 

(GS2) 1/2 = x 2 minus the projection of x 2 onto the line spanned by qi. That is 

y 2 =x 2 - qi(qiqi)~ 1 qiX 2 = x 2 - q x q\x 2 . 
Set q 2 = y 2 /\\y 2 \\ and Q 2 = [g x g 2 ]. 

(GS3) y 3 = x 3 minus the projection of x 3 onto the plane spanned by q± and g 2 - That is 

V3 = X3- Q2(Q 2 Q2)~ 1 Q 2 x 3 
= x 3 - qiqjx 3 - q 2 q 2 x 3 . 

Set q 3 = 2/3/ II2/3 1| and Q 3 = g 2 93]- 
Continue in this fashion through step 

(GSm) y m = x m minus its projection onto the subspace spanned by the columns of Q m -i- That is 

Vm %rn Qm— 1 (Q m — \Qm— l) Qm—l^m 
m—1 

ET 

3=1 

Set q m = y m /||y m ||. 

To take a simple example, let us orthogonalize the following basis for R 3 , 

X! = [1 0] T , x 2 = [1 1 0] T , x 3 =[ll If. 

(GS1) qi = yi = xi. 

(GS2) y 2 = x 2 - q\q\x 2 = [0 1 0] T , and so q 2 = y 2 . 

(GS3) y 3 = x 3 - gigfx 3 - q 2 q\x 3 = [0 and so q 3 = y 3 . 

We have arrived at 

9i = [1 0] r , q 2 = [0 1 0] T , g 3 = [0 1] T . (10.10) 

Once the idea is grasped the actual calculations are best left to a machine. Matlab accomplishes 
this via the orth command. Its implementation is a bit more sophisticated than a blind run through 
our steps (GS1) through (GSm). As a result, there is no guarantee that it will return the same 
basis. For example 

» X=[l 1 1; 1 1; 1]; 

>> Q=orth(X) 

Q = 

0.7370 -0.5910 0.3280 
0.5910 0.3280 -0.7370 
0.3280 0.7370 0.5910 

This ambiguity does not bother us, for one orthogonal basis is as good as another. We next put 
this into practice, via (10.9). 
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10.3. The Diagonalization of a Symmetric Matrix 

By choosing an orthogonal basis {g^ '■ 1 < k < rij} for each TZ(Pj) and collecting the basis 
vectors in 

Qj = [Qj,i Qj,2 ■ ■ ■ Qj, nj ] 

we find that 



Pj = QjQj = E^J fe - 



k=i 

As a result, the spectral representation (10.9) takes the form 

h h n j 

b = j2 \QjQ T j = E E no.iD 

3=1 3=1 k=l 

This is the spectral representation in perhaps its most detailed dress. There exists, however, still 
another form! It is a form that you are likely to see in future engineering courses and is achieved 
by assembling the Qj into a single n-by-n orthonormal matrix 

Q = [Qi ■■■ Qh}- 

Having orthonormal columns it follows that Q T Q = I. Q being square, it follows in addition that 
Q T = Q-\ Now 

Bq jjk = ><jqj,k 

may be encoded in matrix terms via 

BQ = QA (10.12) 

where A is the n-by-n diagonal matrix whose first ri\ diagonal terms are Ai, whose next n-i diagonal 
terms are A2, and so on. That is, each Xj is repeated according to its multiplicity. Multiplying each 
side of (10.12), from the right, by Q T we arrive at 



B = QAQ T . (10.13) 

Because one may just as easily write 



Q T BQ = A (10.14) 



one says that Q diagonalizes B. 
Let us return to our example 




of the last chapter. Recall that the eigenspace associated with Ai = had 

ei,i = [-1 1 0] T and e 1;2 = [-1 1] T 
for a basis. Via Gram-Schmidt we may replace this with 

g lil = -L[-l 10] T and g li2 = -L[-l -12] T . 
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Normalizing the vector associated with A 2 = 3 we arrive at 



* = Ts 11 1 1]T ' 



and hence 

n = \n, , n, „ aJ = 

V6 




Q = [qi,i 9i,2 q 2 ] = -^[ V3 -1 y/2 \ and A 




10.4. Rayleigh's Principle and the Power Method 

As the eigenvalues of an n-bj-n symmetric matrix, B, are real we may order them from high to 
low, 

Ai > A 2 > ••• > A n . 

In this section we will derive two extremely useful characterizations of the largest eigenvalue. The 
first was discovered by Lord Rayleigh in his research on sound and vibration. To begin, we denote 
the associated orthonormal eigenvectors of B by 

9l, 92, 9n 

and note that each x G R™ enjoys the expansion 

x = (x T q 1 )q 1 + (x T q 2 )q 2 H h (x T q n )q n - (10.15) 

Applying B to each side we find 

Bx = (x T q 1 )\iqi + (x T g 2 )A 2 92 H V {x T q n )\ n q n . (10.16) 

Now taking the inner product of (10.15) and (10.16) we find 

x T Bx = (x T q l ) 2 X l + (x T q 2 ) 2 X 2 + ■■■ + (x T q n ) 2 X n 

< Xi{(x T q 1 ) 2 + (x T q 2 ) 2 H h (x T q n ) 2 } 

= \ix T x. 

That is, x T Bx < \ix T x for every x G R". This, together with the fact that qjBqi = Xiqjqi 
establishes 

Proposition 10.6. Rayleigh's Principle. If B is symmetric then its largest eigenvalue is 

x T Bx 
Ai = max — = — 

x^0 X 1 X 

and the maximum is attained on the line through qi . 

For our next characterization we return to (10.16) and record higher powers of B onto x 

B k x = (x T q 1 )\ k 1 q 1 + (x T q 2 )\ k 2 q 2 + ■■■ + (x T q n )X k n q n 

-(x T a)X k (a + X ^ X K + + ^^ k n\ (10-17) 
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And so, using sign(t) = t/\t\, 



/; '' sign(\ k l x T q 1 )q 1 + 0((X 2 /X l ) k ) 



\\B*x\\ 

and note that the latter term goes to zero with increasing k so long as Ai is strictly greater than 
A2. If, in addition, we assume that Ai > 0, then the first term does not depend on k and we arrive 
at 

Proposition 10.7. The Power Method. If B is symmetric and its greatest eigenvalue is simple 
and positive and the initial guess, x, is not orthogonal to q\ then 

i^p4 = sign(xTgi)gi 

10.5. Exercises 

1. The stiffness matrix associated with the unstable swing of figure 2.2 is 

/ 1 -1 0\ 
10 
-10 10 
\ 1/ 

(i) Find the three distinct eigenvalues, Ai = 1,A 2 = 2,A 3 = 0, along with their associated 
eigenvectors 61,1,61,2,62,1,63,1, and projection matrices, Pi,P 2 ,Ps. What are the respective 
geometric multiplicities? 

(ii) Use the folk theorem that states "in order to transform a matrix it suffices to transform its 
eigenvalues" to arrive at a guess for S 1 ^ 2 . Show that your guess indeed satisfies S 1 / 2 S 1 / 2 = S. 
Does your S 1 ^ 2 have anything in common with the element-wise square root of S? 

(iii) Show that TZ(P 3 ) = J\f(S). 

(iv) Assemble 

S + = ^Pi + ^P 2 
and check your result against pinv (S) in MATLAB . 

(v) Use S + to solve Sx = f where / = [0 1 2] T and carefully draw before and after pictures 
of the unloaded and loaded swing. 

(vi) It can be very useful to sketch each of the eigenvectors in this fashion. In fact, a movie is 
the way to go. Please run the Matlab truss demo by typing truss and view all 12 of the 
movies. Please sketch the 4 eigenvectors of (i) by showing how they deform the swing. 

2. The Cayley-Hamilton Theorem. Use Proposition 10.4 to show that if B is real and symmetric 
and p(z) is a polynomial then 

h 

p(B) =^p(A i )P i . (10.18) 

j'=i 

Confirm this in the case that B is the matrix in Exercise 1 and p(z) — z 2 + 1. Deduce from 
(10.18) that if p is the characteristic polynomial of B, i.e., p(z) = det(zl — B), then p(B) = 0. 
Confirm this, via the Matlab command, poly, on the S matrix in Exercise 1. 
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3. Argue that the least eigenvalue of a symmetric matrix obeys the minimum principle 



A n = min^^ (10.19) 

x^O X 1 X 

and that the minimum is attained at q n . 

4. The stiffness of a fiber net is typically equated with the least eigenvalue of its stiffness matrix. 
Let B be the stiffness matrix constructed in Exercise 2 of §2.5, and let C denote the stiffness 
matrix of the same net, except that the stiffness of fiber 4 is twice that used in constructing 
B. Show that x T Bx < x T Cx for all x G R 4 and then deduce from (10.19) that the stiffness of 
the C network exceeds the stiffness of the B network. 

5. The Power Method. Submit a Matlab diary of your application of the Power Method to the 
S matrix in Exercise 1. 
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11. The Singular Value Decomposition 



11.1. Introduction 

The singular value decomposition is, in a sense, the spectral representation of a rectangular 
matrix. Of course if A is m-bj-n and m ^ n then it does not make sense to speak of the eigenvalues 
of A. We may, however, rely on the previous section to give us relevant spectral representations of 
the two symmetric matrices 

A T A and AA T . 

That these two matrices together may indeed tell us 'everything' about A can be gleaned from 

M(A T A) = Af(A), M{AA T ) = M(A T ) ) 
1Z(A T A) = TZ(A T ), and 1Z(AA T ) = 71(A). 

You have proven the first of these in a previous exercise. The proof of the second is identical. The 
row and column space results follow from the first two via orthogonality. 

On the spectral side, we shall now see that the eigenvalues of AA T and A T A are nonnegative 
and that their nonzero eigenvalues coincide. Let us first confirm this on the A matrix associated 
with the unstable swing (see figure 2.2) 
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Analysis of the first is particularly simple. Its null space is clearly just the zero vector while Ai = 2 
and A2 = 1 are its eigenvalues. Their geometric multiplicities are ri\ = 1 and ni = 2. In A T A we 
recognize the S matrix from exercise 10.1 and recall that its eigenvalues are Ai = 2, A2 = 1, and 
A3 = with multiplicities n\ = 1, 112 = 2, and = 1. Hence, at least for this A, the eigenvalues of 
AA T and A T A are nonnegative and their nonzero eigenvalues coincide. In addition, the geometric 
multiplicities of the nonzero eigenvalues sum to 3, the rank of A. 

Proposition 11.1. The eigenvalues of AA T and A T A are nonnegative. Their nonzero eigenvalues, 
including geometric multiplicities, coincide. The geometric multiplicities of the nonzero eigenvalues 
sum to the rank of A. 

Proof: If A T Ax = \x then x T A T Ax = Xx T x, i.e., ||Ar|| 2 = A||x|| 2 and so A > 0. A similar 
argument works for AA T . 

Now suppose that \j > and that {xj,kYk=i constitutes an orthogonal basis for the eigenspace 
1Z(Pj). Starting from 

A T Ax jik = XjX jyk (11-2) 
we find, on multiplying through (from the left) by A that 



A A Axj^ — XjAxj^i 
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i.e., Xj is an eigenvalue of AA T with eigenvector Axj^, so long as Axj^ 7^ 0. It follows from the 
first paragraph of this proof that HAr^H = \fX~j, which, by hypothesis, is nonzero. Hence, 



Vj,k = l<k<nj (11.3) 

is a collection of unit eigenvectors of AA T associated with Xj. Let us now show that these vectors 
are orthonormal for fixed j. 

1 „T A T „„ ,T 



V3,iVo\k = \~ X j,i A Ax 3,k = x j,i x j,k = 0. 



We have now demonstrated that if Xj > is an eigenvalue of A A of geometric multiplicity rij 
then it is an eigenvalue of AA T of geometric multiplicity at least rij. Reversing the argument, i.e., 
generating eigenvectors of A T A from those of AA T we find that the geometric multiplicities must 
indeed coincide. 

Regarding the rank statement, we discern from (11.2) that if Xj > then Xj^ G TZ(A T A). The 
union of these vectors indeed constitutes a basis for 1Z(A T A), for anything orthogonal to each of 
these Xj t k necessarily lies in the eigenspace corresponding to a zero eigenvalue, i.e., in J\f(A T A). As 
TZ(A T A) = K(A T ) it follows that &im.K(A T A) = r and hence the rij, for Xj > 0, sum to r. End of 
Proof. 

Let us now gather together some of the separate pieces of the proof. For starters, we order the 
eigenvalues of A T A from high to low, 

Ai > A 2 > • • • > X h 

and write 

A T A = XA n X T (11.4) 

where 

X = [X x ■ ■ ■ X h \ , where Xj = [x jt i ■ ■ ■ Xj, nj \ 

and A n is the n-by-n diagonal matrix with Ai in the first n\ slots, A 2 in the next n 2 slots, etc. 
Similarly 

AA T = YA m Y T (11.5) 

where 

Y = [Y 1 ---Y h \, where Yj = [y jA ■ ■ ■ y j>nj }. 

and A m is the m-by-m diagonal matrix with Ai in the first n\ slots, A 2 in the next n 2 slots, etc. The 
Uj t k were defined in (11.3) under the assumption that Xj > 0. If Xj = let Yj denote an orthonormal 
basis for M(AA T ). Finally, call 

Gj = VXj 

and let E denote the m-by-n matrix diagonal matrix with <j\ in the first ri\ slots and cr 2 in the next 
n 2 slots, etc. Notice that 

S T S = A n and SS T = A m . (11.6) 
Now recognize that (11.3) may be written 

Axj y k = &jDj,k 
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and that this is simply the column by column rendition of 



AX = YE. 



As XX T = I we may multiply through (from the right) by X T and arrive at the singular value 
decomposition of A, 

A = YT>X T . 



(11.7) 



Let us confirm this on the A matrix in (11.1). We have 
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and X = 
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Hence, 
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(11.8) 



0, 



and so A = YEX T says that A should coincide with 




y/2 N 
10 
1 ; 



f-l/y/2 I /y/2 0\ 

10 

1 

V I /y/2 1/v^ 07 



This indeed agrees with A. It also agrees (up to sign changes in the columns of X) with what one 
receives upon typing [Y, SIG, X] =svd (A) in MATLAB . 

You now ask what we get for our troubles. I express the first dividend as a proposition that 
looks to me like a quantitative version of the fundamental theorem of linear algebra. 

Proposition 11.2. If YEX T is the singular value decomposition of A then 

(i) The rank of A, call it r, is the number of nonzero elements in E. 

(ii) The first r columns of X constitute an orthonormal basis for 1Z(A T ). The n — r last columns 
of X constitute an orthonormal basis for M{A). 

(iii) The first r columns of Y constitute an orthonormal basis for TZ(A). The m — r last columns of 
Y constitute an orthonormal basis for M{A T ). 

Let us now 'solve' Ax = b with the help of the pseudo-inverse of A. You know the 'right' thing 
to do, namely reciprocate all of the nonzero singular values. Because m is not necessarily n we must 
also be careful with dimensions. To be precise, let E + denote the n-by-m matrix whose first n\ 
diagonal elements are l/<7i, whose next n,2 diagonal elements are I/02 and so on. In the case that 
ah = 0, set the final rih diagonal elements of E + to zero. Now, one defines the pseudo-inverse of 
A to be 

A + = XE + Y T . 
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Taking the A of (11.1) we find 
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in agreement with what appears from pinv (A) . Let us now investigate the sense in which A + is 
the inverse of A. Suppose that b e R m and that we wish to solve Ax = b. We suspect that A + b 
should be a good candidate. Observe now that 

(A T A)A + b = XKX T XY, + Y T b by (11.4) 

= XA n Z + Y T b because X T X = I 
= XS T SS+F T 6 by (11.6) 
= XZ T Y T b because S T SS+ = S T 
= A T b by (11.7), 



that is, A + b satisfies the least-squares problem A T Ax = A T b. 
11.2. The SVD in Image Compression 

Most applications of the SVD are manifestations of the folk theorem, The singular vectors asso- 
ciated with the largest singular values capture the essence of the matrix. This is most easily seen 
when applied to gray scale images. For example, the jpeg associated with the image in the top 
left of figure 11.2 is 262-by-165 matrix. Such a matrix-image is read, displayed and 'decomposed' by 
M = imread ( ' jb . jpg' ) ; imagesc (M) ; colormap (gray) ; 
[Y,S,X] = svd (double (M) ) ; 

The singular values lie on the diagonal of S and are arranged in decreasing order. We see their 
rapid decline in Figure 11.1. 
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Figure 11.1. The singular values of John Brown. 

We now experiment with quantitative versions of the folk theorem. In particular, we examine the 
result of keeping but the first k singular vectors and values. That is we construct 

Ak = Y ( : , 1 : k) *S (1 : k, 1 : k) *X ( : , 1 : k) '; 
for decreasing values of k. 




II B ■ 11 



Figure 11.2. The results of imagesc (Ak) for, starting at the top left, k=165, 64, 32 and 
moving right, and then starting at the bottom left and moving right, k=2 4, 20, 16. 
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11.3. Trace, Norm and Low Rank Approximation 

In this section we establish a precise version of last section's folk theorem. To prepare the way 
we will need the notion of matrix trace and norm. The trace of an n-bj-n matrix B is simply the 
sum of its diagonal elements 

n 

j'=i 

It enjoys a very useful property, namely, if A is also n-by-n then 

tr(AB) = tr(BA). (11.10) 

From here we arrive at a lovely identity for symmetric matrices. Namely, if B = B T and we list its 
eigenvalues, including multiplicities, {Ai, A2, . . . , A„} then, from (10.13) it follows that 

n 

tr(B) = tr{QAQ T ) = tr(AQQ T ) = tr(A) = V ( 1L11 ) 

You might wish to confirm that this holds for the many examples we've accumulated. From here 
we may now study the properties of the natural, or Frobenius, norm of an m-by-n matrix A, 

i=i 3=1 / 

It is not difficult to establish that in fact 

\\A\\ F = (tr(AA T )) 1/2 (11.13) 
from which we may deduce from (11.11) that 



\a\\1 = J2Haa t ) = J2*1 



i=l i=l 

i.e., the Frobenius norm of a matrix is the square root of the sum of the squares of its singular 
values. We will also need the fact that if Q is square and Q T Q = I then 

\\QA\\ 2 F = tr(QAA T Q T ) = tr(AA T Q T Q) = tr(AA T ) = \\A\\ 2 F . (11.14) 

The same argument reveals that ||^4(5||f — \\A \\f- We may now establish 

Proposition 11.3. Given an m-by-n matrix A (with SVD A = YY,X T ) and a whole number 
k < min{m, n} then the best (in terms of Frobenius distance) rank k approximation of A is 

A k = Y(:, 1 : fc)E(l : k, 1 : k)X(:, 1 : kf . 

The square of the associated approximation error is 

\\A-A k \\ 2 F = mm \\A-B\\ F = Y.°l 

Proof: If B is m-by-n then 

\\A - B\\ 2 F = ||FSX T - B\\ 2 F = ||y(E - Y T BX)X T \\ 2 F = ||E - Y T BX\\ 2 F . 



If B is to be chosen, among matrices of rank k, to minimize this distance then, as S is diagonal, so 
too must Y T BX. If we denote this diagonal matrix by S then 



Y T BX = S implies B = YSX T , 

and so 



\A-B\\% = ^i- 



8j, 



.1 

As the rank of B is k it follows that the best choice of the Sj is Sj = a 3 - for j = 1 : k and Sj = 
there after. End of Proof. 

11.4. Exercises 



1. Suppose that A is m-by-n and b is m-by-1. Set x + = A + b and suppose x satisfies A T Ax = A T b. 
Prove that < (Hint: decompose x = xr + xn into its row space and null space 
components. Likewise x + = x^ + x^. Now argue that xr = x\ and = and recognize that 
you are almost home.) 

2. Experiment with compressing the bike image below (also under Resources on our Owlspace 
page). Submit labeled figures corresponding to several low rank approximations. Note: 
bike . jpg is really a color file, so after saving it to your directory and entering Matlab you 
might say M = imread ( ' bike . jpg' ) and then M = M ( : , : , 1 ) prior to image sc (M) 
and colormap ( ' gray' ) . 




3. Please prove the identity (11.10). 

4. Please prove the identity (11.13). 
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12. Reduced Dynamical Systems 



We saw in the last chapter how to build low rank approximations of matrices using the Singular 
Value Decomposition, with applications to image compression. We devote this chapter to building 
reduced models of dynamical systems from variations of the power method, with applications to 
neuronal modeling. 

12.1. The Full Model 

We consider distributed current injection into passive RC cable of Chapter 6. We suppose it has 
length I and radius a and we choose a compartment length dx and arrive at n = i/dx compartments. 
The cell's axial resistivity is R. Its membrane capacitance and conductance are C m and Gl- 




Figure 12.1. The full RC cable. 

If Uj is the synaptic current injected at node j and v j is the transmembrane potential there then 
(arguing as in Chapter 6) v obeys the differential equation 



Cv'(t) + Gv(t) =u{t), 



;i2.1) 



where 



C = (2iradxC rn )I n , G = {2-nadxG L )I n - ^^n, 



/„ is the n-by-n identity matrix and S n is the n-by-n second-difference matrix 
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(12.2) 



We divide through by 2nadxC m and arrive at 

v'(t)=Bv(t)+g(t) 

where 



B = 2RC dx2 S n -(G L /C m )I n and g = u/(2nadxC m ). (12.3) 

We will construct a reduced model that faithfully reproduces the response at a specified compart- 
ment (the putative spike initiation zone), assumed without loss to be the first, 

y (t) = q T v(t), where q T = (1 • • • 0), 

to an arbitrary stimulus, u. On applying the Laplace Transform to each side of (12.1) we find 

sCv = BCv + Cg 
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and so the output 

Cy = q T Cv = q T {sI - B)~ 1 Cg = H(s)Cg 
may be expressed in terms of the transfer function 

H(s) = q T (sI -B)-\ 



12.2. The Reduced Model 

We choose a reduced dimension k and an n x k orthonormal reducer X, suppose v = Xv and 
note that v is governed by the reduced system 

v'(t) = X T BXv(t) + X T g(t), 
y(t) = q T Xv(t), 
We choose X so that the associated transfer function, 

H(s) = q T X(sI k - X T BX)- 1 X T , 

is close to H. We will see that it suffices to set 

X(:,l)= fry p- 1 ^, 
X(:,j + l)=B- 1 X(:,j)/\\B- 1 X(:,j)\\ 

followed by orthonormalization, 

X = orth(X). 

We note that (12.4) is nothing more than the Power Method applied to -B -1 , and so as j increases the 
associated column of X approaches the (constant) eigenvector associated with the least eigenvalue 
of B. We apply this algorithm and then return to prove that it indeed makes H close to H. 



j = l,...,k - 1, 



;i2.4) 
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Figure 12.2. Response of the first compartment in a 100 compartment cell and the corresponding 
3 compartment reduced cell to identical current injections distributed randomly in space and time, 
cabred . m 

Now, in order to understand why this method works so well we examine the associated transfer 
functions. In particular, we examine their respective Taylor series about s = 0. To begin, we note 
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that the first resolvent identity, (9.8), permits us to evaluate 

#(.) = Urn *(' + *>-*('> = Mm (-*>*(' + *)«(') = -fl» W . 

and so 

R"(s) = -2R(s)R'(s) = 2R 3 (s), R"'(s) = QR 2 (s)R'(s) = -6R 4 (s), 

and in general 

^ = (-iyj\R^(s) and so d ^ = -j\{B-y^ 

and so 

oo 

H(s) = q T R{s) = -J2 gjM v where M i = q T {B~ 1 ) j+1 

3=0 

are the associated moments. The reduced transfer function is analogously found to be 

oo 

H(s) = -J2 sj Mj, where Mj = q T X((X T BX)- 1 ) j+1 X T . (12.5) 

3=0 

If Mj = Mj for < j < k then H(s) = H(s) + 0(s fe ) and so we now show that the reducer built 
according to (12.4) indeed matches the first k moments. 

Proposition 12.1. If (B-y +1 q G K(X) for < j < k then Mj = Mj for < j < k. 
Proof: We will show that Mj = Mj , beginning with j = where 

M T = B~ x q and M T = X(X T BXy 1 X T q. 
We note that P = X(X T BXy 1 X T B obeys P 2 = P and PX = X and proceed to develop 

M T = X(X T BX)- 1 X T q 

= X(X T BX)- 1 X T BB- l q 
= PB^q = B~ x q = M T , 

where the penultimate equality follows from the assumption that B~ l q e H(X) and the fact that 
P acts like the identity on X. When j = 1 we find 

M? = X(X T BXy l (X T BX)- l X T q 

= X(X T BXy 1 X T BB- 1 X(X T BXy 1 X T BB- 1 q 
= PB- X PB- X q = {B- V ) 2 q = M? 

where the penultimate equality follows from the assumption that {B~ x ) 2 q e TZ(X). The remaining 
moment equalities follow in identical fashion. End of Proof. 
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12.3. Exercises 

1. Show that the n-bj-n second difference S n in (12.2) obeys 

n-l 

x S n x = — ^ ~ x j+i) 

3=1 

for every x G R n . Why does it follow that the eigenvalues of S n are nonpositive? Why does 
it then follow that eigenvalues of the B matrix in (12.3) are negative? If we then label the 
eigenvalues of B as 

K{B) < Xn-i(B) <■■■< X 2 (B) < X 1 (B) < 0, 
which eigenvector of B does (12.4) approach as j increases? 

2. Please derive the Taylor expansion displayed in (12.5). 

3. Please confirm that X(X T BX)~ 1 X T B is indeed projection onto 1Z(X). 
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